path: root/drivers/dma/idxd/device.c
Age  Commit message  Author  Files  Lines
2022-12-28  dmaengine: idxd: Do not call DMA TX callbacks during workqueue disable  Reinette Chatre  1  -0/+11
On driver unload any pending descriptors are flushed and pending DMA descriptors are explicitly completed: idxd_dmaengine_drv_remove() -> drv_disable_wq() -> idxd_wq_free_irq() -> idxd_flush_pending_descs() -> idxd_dma_complete_txd() With this done during driver unload, any remaining descriptor is likely stuck and can be dropped. Even so, the descriptor may still have a callback set that is no longer accessible. An example of such a problem is when dmatest fails and the dmatest module is unloaded. The failure of dmatest leaves descriptors with dma_async_tx_descriptor::callback pointing to code that no longer exists. This causes a page fault as below when the IDXD driver is unloaded and attempts to run the callback: BUG: unable to handle page fault for address: ffffffffc0665190 #PF: supervisor instruction fetch in kernel mode #PF: error_code(0x0010) - not-present page Fix this by clearing the callback pointers on the transmit descriptors, but only when the workqueue is disabled. Fixes: 403a2e236538 ("dmaengine: idxd: change MSIX allocation based on per wq activation") Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Fenghua Yu <fenghua.yu@intel.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/37d06b772aa7f8863ca50f90930ea2fd80b38fc3.1670452419.git.reinette.chatre@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
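The shape of the fix, as a minimal hedged sketch (struct members and helper signatures are approximations of the driver's internals, not the literal device.c code):

    /*
     * Sketch only: when flushing descriptors as part of wq disable, drop the
     * dmaengine callback pointers first so stale module code (e.g. an
     * unloaded dmatest) is never invoked on completion.
     */
    static void flush_pending_descs_sketch(struct idxd_irq_entry *ie)
    {
            struct idxd_desc *desc, *n;

            list_for_each_entry_safe(desc, n, &ie->work_list, list) {
                    struct dma_async_tx_descriptor *tx = &desc->txd;

                    tx->callback = NULL;            /* never call back into stale code */
                    tx->callback_result = NULL;
                    idxd_dma_complete_txd(desc, IDXD_COMPLETE_ABORT, true);
            }
    }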
2022-12-28  dmaengine: idxd: Prevent use after free on completion memory  Reinette Chatre  1  -1/+1
On driver unload any pending descriptors are flushed at the time the interrupt is freed: idxd_dmaengine_drv_remove() -> drv_disable_wq() -> idxd_wq_free_irq() -> idxd_flush_pending_descs(). If there are any descriptors present that need to be flushed this flow triggers a "not present" page fault as below: BUG: unable to handle page fault for address: ff391c97c70c9040 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page The address that triggers the fault is the address of the descriptor that was freed moments earlier via: drv_disable_wq()->idxd_wq_free_resources() Fix the use after free by freeing the descriptors after any possible usage. This is done after idxd_wq_reset() to ensure that the memory remains accessible during possible completion writes by the device. Fixes: 63c14ae6c161 ("dmaengine: idxd: refactor wq driver enable/disable operations") Suggested-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Fenghua Yu <fenghua.yu@intel.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/6c4657d9cff0a0a00501a7b928297ac966e9ec9d.1670452419.git.reinette.chatre@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-12-28  dmaengine: idxd: Let probe fail when workqueue cannot be enabled  Reinette Chatre  1  -2/+1
The workqueue is enabled when the appropriate driver is loaded and disabled when the driver is removed. When the driver is removed it assumes that the workqueue was enabled successfully and proceeds to free allocations made during workqueue enabling. Failure during workqueue enabling does not prevent the driver from being loaded. This is because the error path within drv_enable_wq() returns success unless a second failure is encountered during the error path. By returning success it is possible to load the driver even if the workqueue cannot be enabled, and the driver remove path then attempts to free allocations that were never made. Some examples of problematic flows: (a) idxd_dmaengine_drv_probe() -> drv_enable_wq() -> idxd_wq_request_irq(): In the above flow, if idxd_wq_request_irq() fails then idxd_wq_unmap_portal() is called on the error exit path, but drv_enable_wq() returns 0 because idxd_wq_disable() succeeds. The driver is thus loaded successfully. idxd_dmaengine_drv_remove()->drv_disable_wq()->idxd_wq_unmap_portal() The above flow on driver unload triggers the WARN in devm_iounmap() because the device resource has already been removed during the error path of drv_enable_wq(). (b) idxd_dmaengine_drv_probe() -> drv_enable_wq() -> idxd_wq_request_irq(): In the above flow, if idxd_wq_request_irq() fails then idxd_wq_init_percpu_ref() is never called to initialize the percpu counter, yet the driver loads successfully because drv_enable_wq() returns 0. idxd_dmaengine_drv_remove()->__idxd_wq_quiesce()->percpu_ref_kill(): The above flow on driver unload triggers a BUG when attempting to drop the initial ref of the uninitialized percpu ref: BUG: kernel NULL pointer dereference, address: 0000000000000010 Fix the drv_enable_wq() error path by returning the original error that indicates failure of workqueue enabling. This ensures that the probe fails when an error is encountered and the driver remove paths are only attempted when the workqueue was enabled successfully. Fixes: 1f2bb40337f0 ("dmaengine: idxd: move wq_enable() to device.c") Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Fenghua Yu <fenghua.yu@intel.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/e8d8116e5efa0fd14fadc5adae6ffd319f0e5ff1.1670452419.git.reinette.chatre@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
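The fix follows the common "preserve the first error" unwind pattern; a minimal sketch with hypothetical step names (not the actual drv_enable_wq() body):

    /* Sketch: the unwind step's return value must never overwrite 'rc'. */
    static int enable_wq_sketch(struct idxd_wq *wq)
    {
            int rc;

            rc = request_irq_step(wq);              /* hypothetical enable step */
            if (rc < 0) {
                    /*
                     * Undo what was done so far, but return the original
                     * error; returning the undo step's 0 is what allowed a
                     * failed probe to look successful.
                     */
                    disable_wq_step(wq);            /* hypothetical undo step */
                    return rc;
            }
            return 0;
    }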
2022-11-14  dmaengine: idxd: Remove linux/msi.h include  Thomas Gleixner  1  -1/+0
Nothing in this file needs anything from linux/msi.h Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Vinod Koul <vkoul@kernel.org> Cc: dmaengine@vger.kernel.org Link: https://lore.kernel.org/r/20221113202428.573536003@linutronix.de Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-11-08  dmaengine: idxd: fix RO device state error after been disabled/reset  Fengqian Gao  1  -6/+14
When IDXD is not configurable, that means its WQ, engine, and group configurations cannot be changed. But it can still be disabled, and its state should be set to disabled regardless of whether it is configurable or not. Fix this by setting the device state to IDXD_DEV_DISABLED for a read-only device as well in idxd_device_clear_state(). Fixes: cf4ac3fef338 ("dmaengine: idxd: fix lockdep warning on device driver removal") Signed-off-by: Fengqian Gao <fengqian.gao@intel.com> Reviewed-by: Xiaochen Shen <xiaochen.shen@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20220930032835.2290-1-fengqian.gao@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-11-08  dmaengine: idxd: Fix max batch size for Intel IAA  Xiaochen Shen  1  -3/+3
From the Intel IAA spec [1], Intel IAA does not support batch processing. Two batch-related default values for IAA are incorrect in the current code: (1) The max batch size of the device is set during device initialization, which indicates that batch is supported. It should always be 0 on IAA. (2) The max batch size of the work queue is set to WQ_DEFAULT_MAX_BATCH (32) as the default value during work queue setup and cleanup, regardless of whether the device is Intel DSA or IAA. It should always be 0 on IAA. Fix the issues by setting the max batch size of the device and the max batch size of the work queue to 0 on IAA devices, meaning batch is not supported. [1]: https://cdrdv2.intel.com/v1/dl/getContent/721858 Fixes: 23084545dbb0 ("dmaengine: idxd: set max_xfer and max_batch for RO device") Fixes: 92452a72ebdf ("dmaengine: idxd: set defaults for wq configs") Fixes: bfe1d56091c1 ("dmaengine: idxd: Init and probe for Intel data accelerators") Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20220930201528.18621-2-xiaochen.shen@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-09-29  dmaengine: idxd: add configuration for concurrent batch descriptor processing  Dave Jiang  1  -0/+2
Add a sysfs knob to allow control of the number of batch descriptors that can be concurrently processed by an engine in the group, as a fraction of the Maximum Work Descriptors in Progress value specified in the ENGCAP register. This control knob is part of the toggles for QoS control. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20220917161222.2835172-6-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-09-29  dmaengine: idxd: add configuration for concurrent work descriptor processing  Dave Jiang  1  -5/+8
Add a sysfs knob to allow control of the number of work descriptors that can be concurrently processed by an engine in the group, as a fraction of the Maximum Work Descriptors in Progress value specified in the ENGCAP register. This control knob is part of the toggles for QoS control. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20220917161222.2835172-5-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-09-29  dmaengine: idxd: add WQ operation cap restriction support  Dave Jiang  1  -1/+14
DSA 2.0 adds the capability of configuring DMA ops on a per-workqueue basis. This means that certain ops can be disabled by the system administrator for certain wqs. By default, all ops are available. A bitmap is used to store the ops due to the total op size of 256 bits, and it is more convenient to use a range list to specify which bits are enabled. One usage this supports is VM migration between different iterations of devices. The newer ops are disabled in order to allow a guest to migrate to a host that only supports older ops. Another usage is to restrict the WQ to certain operations for QoS of performance. An ops_config sysfs attribute is added per wq. It is only usable when the ops_config bit is set under the WQ_CAP register. This means that this attribute will return -EOPNOTSUPP on DSA 1.x devices. The expected input is a range list for the bits per operation the WQ supports. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20220917161222.2835172-4-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
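A minimal sketch of how such a range-list input can be parsed into a per-wq ops bitmap, assuming a 256-bit bitmap field (opcap_bmap) and using the kernel's bitmap_parselist() helper; names are illustrative, not the actual sysfs store code:

    #define EXAMPLE_OPCAP_BITS 256

    static int wq_ops_config_store_sketch(struct idxd_wq *wq, const char *buf)
    {
            DECLARE_BITMAP(opmask, EXAMPLE_OPCAP_BITS);
            int rc;

            /* Accepts input such as "0-3,16" and sets exactly those bits. */
            rc = bitmap_parselist(buf, opmask, EXAMPLE_OPCAP_BITS);
            if (rc < 0)
                    return rc;

            bitmap_copy(wq->opcap_bmap, opmask, EXAMPLE_OPCAP_BITS);
            return 0;
    }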
2022-09-29  dmaengine: idxd: convert ats_dis to a wq flag  Dave Jiang  1  -2/+2
Make wq attributes access consistent. Convert ats_dis to wq flag WQ_FLAG_ATS_DISABLE. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20220917161222.2835172-2-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-09-29  dmaengine: idxd: track enabled workqueues in bitmap  Jerry Snitselaar  1  -0/+2
Now that idxd_wq_disable_cleanup() sets the workqueue state to IDXD_WQ_DISABLED, use a bitmap to track which workqueues have been enabled. This will then be used to determine which workqueues should be re-enabled when attempting a software reset to recover from a device halt state. Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Vinod Koul <vkoul@kernel.org> Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20220928154856.623545-3-jsnitsel@redhat.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
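A minimal sketch of the tracking, assuming a device-wide bitmap field (wq_enable_map) and the existing per-wq id; the real recovery path differs in detail:

    static void mark_wq_enabled_sketch(struct idxd_device *idxd, struct idxd_wq *wq)
    {
            set_bit(wq->id, idxd->wq_enable_map);   /* remember: this wq was enabled */
    }

    static void reenable_wqs_after_reset_sketch(struct idxd_device *idxd)
    {
            int id;

            /* After a software reset, only bring back wqs that were enabled. */
            for_each_set_bit(id, idxd->wq_enable_map, idxd->max_wqs)
                    idxd_wq_enable(idxd->wqs[id]);
    }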
2022-09-29  dmaengine: idxd: Set wq state to disabled in idxd_wq_disable_cleanup()  Jerry Snitselaar  1  -1/+1
If we are calling idxd_wq_disable_cleanup(), the workqueue should be in a disabled state. So set the workqueue state to IDXD_WQ_DISABLED so that the state reflects that. Currently if there is a device failure, and a software reset is attempted the workqueues will not be re-enabled due to idxd_wq_enable() seeing that state as already being IDXD_WQ_ENABLED. Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Vinod Koul <vkoul@kernel.org> Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20220928154856.623545-2-jsnitsel@redhat.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-07-01  dmaengine: idxd: force wq context cleanup on device disable path  Dave Jiang  1  -4/+1
Testing has shown that when a wq is set up as dedicated and then torn down and reconfigured as shared, the configured wq ends up being dedicated anyway. The root cause is that when idxd_device_wqs_clear_state() gets called during idxd_driver removal, idxd_wq_disable_cleanup() does not get called, unlike when the wq driver is removed first. The check of the wq state being "enabled" causes the cleanup to be bypassed. However, idxd_driver->remove() releases all wq drivers, so the wqs go to the "disabled" state and will never be "enabled". By that point, the driver has no idea whether the wq was previously configured or clean. So force a call to idxd_wq_disable_cleanup() on all wqs, always, to make sure everything gets cleaned up. Reported-by: Tony Zhu <tony.zhu@intel.com> Tested-by: Tony Zhu <tony.zhu@intel.com> Fixes: 0dcfe41e9a4c ("dmanegine: idxd: cleanup all device related bits after disabling device") Signed-off-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20220628230056.2527816-1-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-05-29  Merge tag 'dmaengine-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine  Linus Torvalds  1  -46/+105
Pull dmaengine updates from Vinod Koul: "Nothing special, this includes a couple of new device support and new driver support and a bunch of driver updates. New support: - Tegra gpcdma driver support - Qualcomm SM8350, SM8450 and SC7280 device support - Renesas RZN1 dma and platform support Updates: - stm32 device pause/resume support and updates - DMA memset ops Documentation and usage clarification - deprecate '#dma-channels' & '#dma-requests' bindings - driver updates for stm32, ptdma, idxd, etc." * tag 'dmaengine-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (87 commits) dmaengine: idxd: make idxd_wq_enable() return 0 if wq is already enabled dmaengine: sun6i: Add support for the D1 variant dmaengine: sun6i: Add support for 34-bit physical addresses dmaengine: sun6i: Do not use virt_to_phys dt-bindings: dma: sun50i-a64: Add compatible for D1 dmaengine: tegra: Remove unused switch case dmaengine: tegra: Fix uninitialized variable usage dmaengine: stm32-dma: add device_pause/device_resume support dmaengine: stm32-dma: rename pm ops before dma pause/resume introduction dmaengine: stm32-dma: pass DMA_SxSCR value to stm32_dma_handle_chan_done() dmaengine: stm32-dma: introduce stm32_dma_sg_inc to manage chan->next_sg dmaengine: stm32-dmamux: avoid reset of dmamux if used by coprocessor dmaengine: qcom: gpi: Add support for sc7280 dt-bindings: dma: pl330: Add power-domains dmaengine: stm32-mdma: use dev_dbg on non-busy channel spurious it dmaengine: stm32-mdma: fix chan initialization in stm32_mdma_irq_handler() dmaengine: stm32-mdma: remove GISR1 register dmaengine: ti: deprecate '#dma-channels' dmaengine: mmp: deprecate '#dma-channels' dmaengine: pxa: deprecate '#dma-channels' and '#dma-requests' ...
2022-05-19  dmaengine: idxd: make idxd_wq_enable() return 0 if wq is already enabled  Dave Jiang  1  -1/+1
When idxd_wq_enable() is called and the wq is already enabled, the code should return 0 and indicate success instead of returning an error code and failing. This also puts idxd_wq_enable() in sync with idxd_wq_disable(), which returns 0 if the wq is already disabled. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/165090980906.1378449.1939401700832432886.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
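A minimal sketch of the idempotent behavior described (state names follow the commit text; the real function also issues the device command and checks its status):

    int idxd_wq_enable_sketch(struct idxd_wq *wq)
    {
            if (wq->state == IDXD_WQ_ENABLED)
                    return 0;       /* already enabled: success, not an error */

            /* ... send the WQ enable command to the device, check status ... */

            wq->state = IDXD_WQ_ENABLED;
            return 0;
    }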
2022-05-16  dmaengine: idxd: Remove unnecessary synchronize_irq() before free_irq()  Minghao Chi  1  -1/+0
Calling synchronize_irq() right before free_irq() is quite useless. On one hand the IRQ can easily fire again before free_irq() is entered, on the other hand free_irq() itself calls synchronize_irq() internally (in a race condition free way), before any state associated with the IRQ is freed. Signed-off-by: Minghao Chi <chi.minghao@zte.com.cn> Link: https://lore.kernel.org/r/20220516115412.1651772-1-chi.minghao@zte.com.cn Acked-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-05-16  dmaengine: idxd: skip irq free when wq type is not kernel  Dave Jiang  1  -0/+3
Skip freeing the wq irq resources when the wq type is not kernel, since the driver skips the irq allocation during wq enable. Add a wq type check in idxd_wq_free_irq() to mirror idxd_wq_request_irq(). Fixes: 63c14ae6c161 ("dmaengine: idxd: refactor wq driver enable/disable operations") Reported-by: Tony Zu <tony.zhu@intel.com> Tested-by: Tony Zu <tony.zhu@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/165176310726.2112428.7474366910758522079.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-05-16  dmaengine: idxd: remove redundant idxd_wq_disable_cleanup() call  Dave Jiang  1  -1/+0
idxd_wq_device_reset_cleanup() already calls idxd_wq_disable_cleanup(). There is no need to call idxd_wq_disable_cleanup() again in idxd_device_wqs_clear_state(). Remove the redundant call from idxd_wq_device_reset_cleanup(). Fixes: 0dcfe41e9a4c ("dmanegine: idxd: cleanup all device related bits after disabling device") Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/165231365717.986350.2441351765955825964.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-05-16  dmaengine: idxd: free irq before wq type is reset  Dave Jiang  1  -1/+1
Call idxd_wq_free_irq() in the drv_disable_wq() function before idxd_wq_reset() is called. Otherwise the wq type is reset and the irq does not get freed. Fixes: 63c14ae6c161 ("dmaengine: idxd: refactor wq driver enable/disable operations") Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/165231367316.986407.11001767338124941736.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-05-16  dmaengine: idxd: fix lockdep warning on device driver removal  Dave Jiang  1  -7/+7
Jacob reported that with lockdep debug turned on, idxd_device_driver removal causes kernel splat from lock assert warning for idxd_device_wqs_clear_state(). Make sure idxd_device_wqs_clear_state() holds the wq lock for each wq when cleaning the wq state. Move the call outside of the device spinlock. Reported-by: Jacob Pan <jacob.jun.pan@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/165231364426.986304.9294302800482492780.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-05-16  dmaengine: idxd: Separate user and kernel pasid enabling  Dave Jiang  1  -3/+3
The idxd driver always gated the pasid enabling under a single knob and this assumption is incorrect. The pasid used for kernel operation can be independently toggled and has no dependency on the user pasid (and vice versa). Split the two so they are independent "enabled" flags. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/165231431746.986466.5666862038354800551.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-04-22  dmaengine: idxd: refactor wq driver enable/disable operations  Dave Jiang  1  -21/+37
Move the core driver operations from wq driver to the drv_enable_wq() and drv_disable_wq() functions. The move should reduce the wq driver's knowledge of the core driver operations and prevent code confusion for future wq drivers. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/165047301643.3841827.11222723219862233060.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-04-20  dmaengine: idxd: skip clearing device context when device is read-only  Dave Jiang  1  -0/+3
If the device shows up as read-only configuration, skip the clearing of the state as the context must be preserved for device re-enable after being disabled. Fixes: 0dcfe41e9a4c ("dmanegine: idxd: cleanup all device related bits after disabling device") Reported-by: Tony Zhu <tony.zhu@intel.com> Tested-by: Tony Zhu <tony.zhu@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/164971479479.2200566.13980022473526292759.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-04-20  dmaengine: idxd: set max_xfer and max_batch for RO device  Dave Jiang  1  -0/+3
Load the max_xfer_size and max_batch_size values from the values read from registers to the shadow variables. This will allow the read-only device to display the correct values for the sysfs attributes. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/164971507673.2201761.11244446608988838897.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-04-11  dmaengine: idxd: don't load pasid config until needed  Dave Jiang  1  -14/+52
The driver currently programs the system pasid to the WQ preemptively when system pasid is enabled. Given that a dwq will reprogram the pasid and possibly a different pasid, the programming is not necessary. The pasid_en bit can be set for swq as it does not need pasid programming but needs the pasid_en bit. Remove system pasid programming on device config write. Add pasid programming for kernel wq type on wq driver enable. The char dev driver already reprograms the dwq on ->open() call so there's no change. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/164935607115.1660372.6734518676950372366.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-04-08  dmaengine: idxd: fix device cleanup on disable  Dave Jiang  1  -2/+1
There are certain parts of the WQ that need to be cleaned up even after the WQ is disabled during the device disable. Those are the parts that are unchangeable for a WQ while the device is still enabled. Move that cleanup outside of the WQ state check. Remove idxd_wq_disable_cleanup() inside idxd_wq_device_reset_cleanup() since only the unchangeable parts need to be cleared. Fixes: 0f225705cf65 ("dmaengine: idxd: fix wq settings post wq disable") Reported-by: Tony Zhu <tony.zhu@intel.com> Tested-by: Tony Zhu <tony.zhu@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/164919561905.1455025.13542366389944678346.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-02-15  dmaengine: idxd: restore traffic class defaults after wq reset  Dave Jiang  1  -2/+7
When clearing the group configurations, the driver fails to restore the default setting for DSA 1.x based devices. Add defaults in idxd_groups_clear_state() for traffic class configuration. Fixes: ade8a86b512c ("dmaengine: idxd: Set defaults for GRPCFG traffic class") Reported-by: Binuraj Ravindran <binuraj.ravindran@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/164304123369.824298.6952463420266592087.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-01-05  dmaengine: idxd: change bandwidth token to read buffers  Dave Jiang  1  -13/+12
DSA spec v1.2 has changed the term of "bandwidth tokens" to "read buffers" in order to make the concept clearer. Deprecate bandwidth token naming in the driver and convert to read buffers in order to match with the spec and reduce confusion when reading the spec. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/163951338932.2988321.6162640806935567317.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-01-05  dmaengine: idxd: fix wq settings post wq disable  Dave Jiang  1  -2/+10
Per the spec, wq size and group association are not changeable unless the device is disabled. Exclude clearing the shadow copy on wq disable/reset. This allows the wq type to be changed after disable and the wq to be re-enabled. Move the size and group association to their own cleanup and only call it during device disable. Fixes: 0dcfe41e9a4c ("dmanegine: idxd: cleanup all device related bits after disabling device") Reported-by: Lucas Van <lucas.van@intel.com> Tested-by: Lucas Van <lucas.van@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/163951291732.2987775.13576571320501115257.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-01-05  dmaengine: idxd: change MSIX allocation based on per wq activation  Dave Jiang  1  -61/+100
Change the driver so that the WQ interrupt is requested only when the wq is being enabled. This new scheme sets things up so that request_threaded_irq() is only called when a kernel wq type is being enabled. This also prepares for future interrupt requests where a different interrupt handler, such as a wq occupancy interrupt, can be set up instead of the wq completion interrupt. Not calling request_irq() until the WQ actually needs an irq also prevents wasting CPU irq vectors on x86 systems, which are a limited resource. idxd_flush_pending_descs() is moved to device.c since descriptor flushing is now part of wq disable rather than shutdown(). Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/163942149487.2412839.6691222855803875848.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
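A minimal sketch of the per-wq interrupt request at enable time (helper and field names are approximations of the driver's internals):

    static int wq_request_irq_sketch(struct idxd_wq *wq)
    {
            struct idxd_irq_entry *ie = &wq->ie;

            /* Non-kernel wq types do not use a driver-owned completion irq. */
            if (!is_idxd_wq_kernel(wq))
                    return 0;

            /* Only a kernel wq being enabled consumes an MSI-X vector. */
            return request_threaded_irq(ie->vector, NULL, idxd_wq_thread,
                                        0, "idxd-wq", ie);
    }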
2022-01-05  dmaengine: idxd: embed irq_entry in idxd_wq struct  Dave Jiang  1  -10/+8
With irq_entry already being associated with the wq in a 1:1 relationship, embed the irq_entry in the idxd_wq struct and remove back pointers for idxd_wq and idxd_device. In the process of this work, clean up the interrupt handle assignment so that there's no decision to be made during the submit call on where the interrupt handle value comes from. Set the interrupt handle at irq request initialization time. irq_entry 0 is designated as special and is tied to the device itself. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/163942148362.2412839.12055447853311267866.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-12-17  dmaengine: idxd: add knob for enqcmds retries  Dave Jiang  1  -0/+1
Add a sysfs knob to allow tuning of retries for the kernel ENQCMDS descriptor submission. While on host, it is not as likely that ENQCMDS return busy during normal operations due to the driver controlling the number of descriptors allocated for submission. However, when the driver is operating as a guest driver, the chance of retry goes up significantly due to sharing a wq with multiple VMs. A default value is provided with the system admin being able to tune the value on a per WQ basis. Suggested-by: Sanjay Kumar <sanjay.k.kumar@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/163820629464.2702134.7577370098568297574.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
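A minimal sketch of the retry loop that the knob tunes, assuming a per-wq enqcmds_retries field; enqcmds() returns non-zero while the shared wq is busy:

    static int submit_with_retries_sketch(struct idxd_wq *wq, void __iomem *portal,
                                          const void *desc)
    {
            unsigned int retries = wq->enqcmds_retries;     /* sysfs-tunable */
            int rc;

            do {
                    rc = enqcmds(portal, desc);
                    if (rc == 0)
                            return 0;
                    cpu_relax();            /* device busy, back off briefly */
            } while (retries--);

            return rc;                      /* still busy after the configured retries */
    }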
2021-12-17  dmaengine: idxd: set defaults for wq configs  Dave Jiang  1  -8/+5
Add default values for wq size, max_xfer_size and max_batch_size. These values should provide a general guidance for the wq configuration when the user does not specify any specific values. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/163528473483.3926048.7950067926287180976.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-11-22  dmaengine: idxd: handle interrupt handle revoked event  Dave Jiang  1  -1/+5
"Interrupt handle revoked" is an event that happens when the driver is running on a guest kernel and the VM is migrated to a new machine. The device will trigger an interrupt that signals to the guest driver that the interrupt handles need to be replaced. The misc irq thread function calls a helper function to handle the event. The function uses the WQ percpu_ref to quiesce the kernel submissions. It then replaces the interrupt handles by requesting interrupt handle command for each I/O MSIX vector. Once the handle is updated, the driver will unblock the submission path to allow new submissions. The submitter will attempt to acquire a percpu_ref before submission. When the request fails, it will wait on the wq_resurrect 'completion'. The driver does anticipate the possibility of descriptors being submitted before the WQ percpu_ref is killed. If a descriptor has already been submitted, it will return with incorrect interrupt handle status. The descriptor will be re-submitted with the new interrupt handle on the completion path. For descriptors with incorrect interrupt handles, completion interrupt won't be triggered. At the completion of the interrupt handle refresh, the handling function will call idxd_int_handle_refresh_drain() to issue drain descriptors to each of the wq with associated interrupt handle. The drain descriptor will have interrupt request set but without completion record. This will ensure all descriptors with incorrect interrupt completion handle get drained and a completion interrupt is triggered for the guest driver to process them. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Co-Developed-by: Sanjay Kumar <sanjay.k.kumar@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/163528420189.3925689.18212568593220415551.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-11-22  dmaengine: idxd: create locked version of idxd_quiesce() call  Dave Jiang  1  -1/+9
Add a locked version of idxd_quiesce() call so that the quiesce can be called with a lock in situations where the lock is not held by the caller. In the driver probe/remove path, the lock is already held, so the raw version can be called w/o locking. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/163528418980.3925689.5841907054957931211.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-11-22  dmaengine: idxd: int handle management refactoring  Dave Jiang  1  -0/+8
Attach int_handle to irq_entry. This removes the separate management of int handles and reduces the confusion of iterating through int handles that are off by a count of 1. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/163528417065.3925689.11505755433684476288.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-10-28  dmaengine: idxd: cleanup completion record allocation  Dave Jiang  1  -17/+5
According to core-api/dma-api-howto.rst, the address from dma_alloc_coherent() is guaranteed to be aligned to the smallest PAGE_SIZE order. That supersedes the 64B/32B alignment requirement of the completion record. Remove the alignment adjustment code. Tested-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/163517396063.3484297.7494385225280705372.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
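A minimal sketch of what the simplified allocation can look like once the alignment guarantee is relied upon (field names are approximations):

    static int alloc_completion_records_sketch(struct device *dev, struct idxd_wq *wq,
                                               size_t rec_size, int num_descs)
    {
            wq->compls_size = num_descs * rec_size;

            /*
             * dma_alloc_coherent() returns memory aligned to at least the
             * smallest PAGE_SIZE order covering the size, which already
             * satisfies the 32B/64B completion record alignment, so no
             * manual rounding is needed.
             */
            wq->compls = dma_alloc_coherent(dev, wq->compls_size,
                                            &wq->compls_addr, GFP_KERNEL);
            return wq->compls ? 0 : -ENOMEM;
    }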
2021-10-25  dmaengine: idxd: reconfig device after device reset command  Dave Jiang  1  -0/+2
Device reset clears the MSIXPERM table and the device registers. Re-program the MSIXPERM table and re-enable the error interrupts post reset. Fixes: 745e92a6d816 ("dmaengine: idxd: idxd: move remove() bits for idxd 'struct device' to device.c") Reported-by: Sanjay Kumar <sanjay.k.kumar@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/163054188513.2853562.12077053294595278181.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-10-25  dmaengine: idxd: remove kernel wq type set when load configuration  Dave Jiang  1  -2/+0
Remove setting of wq type on guest kernel during configuration load on RO device config. The user will set the kernel wq type and this setting based on config is not necessary. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/163474724511.2607444.1876715711451990426.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-10-18  dmaengine: idxd: check GENCAP config support for gencfg register  Dave Jiang  1  -1/+1
DSA spec 1.2 has moved the GENCFG register under the GENCAP configuration support with respect to writability. Add check in driver before writing to GENCFG register. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/163406171896.1303830.11217958011385656998.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-10-01  dmaengine: idxd: move out percpu_ref_exit() to ensure it's outside submission  Dave Jiang  1  -1/+0
percpu_ref_tryget_live() is safe to call as long as ref is between init and exit according to the function comment. Move percpu_ref_exit() so it is called after the dma channel is no longer valid to ensure this holds true. Fixes: 93a40a6d7428 ("dmaengine: idxd: add percpu_ref to descriptor submission path") Suggested-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/163294293832.914350.10326422026738506152.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-08-29  dmaengine: idxd: remove interrupt disable for dev_lock  Dave Jiang  1  -19/+12
The spinlock is not being used in hard interrupt context. There is no need to disable irq when acquiring the lock. The interrupt thread handler also is not in bottom half context, therefore we can also remove disabling of the bh. Convert all dev_lock acquisition to plain spin_lock() calls. Reviewed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/162984026772.1939166.11504067782824765879.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
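A minimal before/after sketch of the conversion: the lock is only ever taken from process context and the interrupt thread, so the irq-disabling and bottom-half variants are unnecessary.

    static void dev_lock_usage_sketch(struct idxd_device *idxd)
    {
            /*
             * Before: spin_lock_irqsave(&idxd->dev_lock, flags) /
             * spin_unlock_irqrestore(), plus _bh variants in the irq thread.
             * Neither hard-irq nor bottom-half context takes this lock.
             */

            /* After: plain process-context locking is sufficient. */
            spin_lock(&idxd->dev_lock);
            /* ... update device state guarded by dev_lock ... */
            spin_unlock(&idxd->dev_lock);
    }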
2021-08-29  dmaengine: idxd: remove interrupt disable for cmd_lock  Dave Jiang  1  -11/+8
The cmd_lock spinlock is not being used in hard interrupt context. There is no need to disable irq when acquiring the lock. Convert all cmd_lock acquisition to plain spin_lock() calls. Reviewed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/162984027930.1939209.15758413737332339204.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-08-29  dmaengine: idxd: fix setting up priv mode for dwq  Dave Jiang  1  -1/+28
DSA spec says WQ priv bit is 0 if the Privileged Mode Enable field of the PCI Express PASID capability is 0 and pasid is enabled. Make sure that the WQCFG priv field is set correctly according to usage type. Reject config if setting up kernel WQ type and no support. Also add the correct priv setup for a descriptor. Fixes: 484f910e93b4 ("dmaengine: idxd: fix wq config registers offset programming") Cc: Ramesh Thomas <ramesh.thomas@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/162939084657.903168.14160019185148244596.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-08-25  dmaengine: idxd: set descriptor allocation size to threshold for swq  Dave Jiang  1  -2/+2
Since submission is sent to limited portal, the actual wq size for shared wq is set by the threshold rather than the wq size. When the wq type is shared, set the allocated descriptors to the threshold. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/162827151733.3459223.3829837172226042408.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-08-06  dmaengine: idxd: clear block on fault flag when clear wq  Dave Jiang  1  -0/+1
The block-on-fault flag is not cleared when we disable or reset the wq. This causes it to remain set if the user does not clear it on the next configuration load. Add clearing of the flag in the idxd_wq_disable_cleanup() routine. Fixes: da32b28c95a7 ("dmaengine: idxd: cleanup workqueue config after disabling") Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/162803023553.3086015.8158952172068868803.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-07-28  dmaengine: idxd: add software command status  Dave Jiang  1  -3/+19
Enabling the device and wq returns a standard errno, and that does not provide enough detail to indicate what exactly failed. The hardware command status is only 8 bits. Expand the command status to 32 bits and use the upper 16 bits to define software errors, to provide more detail on the exact failure. Bit 31 will be used to indicate that the error is software set, as the driver uses some of the spec-defined hardware errors as well. Cc: Ramesh Thomas <ramesh.thomas@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/162681373579.1968485.5891788397526827892.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
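A minimal sketch of the widened status layout described above (macro names are illustrative; the driver's actual software status encoding may differ in detail):

    #define EXAMPLE_CMDSTS_SW_BIT    BIT(31)        /* value was set by software */
    #define EXAMPLE_CMDSTS_SW_SHIFT  16             /* software error code field */

    static inline u32 sw_cmd_status_sketch(u16 sw_err)
    {
            /* Low 8 bits stay reserved for the hardware command status. */
            return EXAMPLE_CMDSTS_SW_BIT | ((u32)sw_err << EXAMPLE_CMDSTS_SW_SHIFT);
    }

    static inline bool cmd_status_is_sw_sketch(u32 status)
    {
            return status & EXAMPLE_CMDSTS_SW_BIT;
    }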
2021-07-28  dmaengine: idxd: rotate portal address for better performance  Dave Jiang  1  -0/+1
The device submission portal is on a 4k page, and any of the 64-bit aligned addresses on the page can be used for descriptor submission. By rotating the offset through the 4k range and preventing successive writes to the same MMIO address, a performance improvement is observed in testing. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/162681372446.1968485.10634280461681015569.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
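A minimal sketch of the rotation, assuming a per-wq portal_offset field and a 64-byte stride (one descriptor-sized step) through the 4 KB portal page:

    #define EXAMPLE_PORTAL_STRIDE   64

    static void __iomem *next_portal_addr_sketch(struct idxd_wq *wq)
    {
            unsigned int ofs = wq->portal_offset;

            /* Wrap within the 4 KB portal page so writes keep moving around. */
            wq->portal_offset = (ofs + EXAMPLE_PORTAL_STRIDE) & (SZ_4K - 1);
            return wq->portal + ofs;
    }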
2021-07-21  dmaengine: idxd: move dsa_drv support to compatible mode  Dave Jiang  1  -0/+1
The original architecture of /sys/bus/dsa invented a scheme whereby a single entry in the list of bus drivers, /sys/bus/drivers/dsa, handled all device types and internally routed them to different drivers. Those internal drivers were invisible to userspace. With the idxd driver transitioned to a proper bus device-driver model, the legacy behavior needs to be preserved due to it being exposed to user space via sysfs. Create a compat driver to provide the legacy behavior for /sys/bus/dsa/drivers/dsa. This should satisfy the user tool accel-config v3.2 or earlier, where this behavior is expected. If the distro has a newer accel-config then the legacy mode does not need to be enabled. When the compat driver binds the device (i.e. dsa0) to the dsa driver, it will be bound to the new idxd_drv. The wq device (i.e. wq0.0) will be bound to either the dmaengine_drv or the user_drv. The dsa_drv becomes a routing mechanism for the new drivers. It will not support additional external drivers that are implemented later. Reviewed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/162637468705.744545.4399080971745974435.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-07-21  dmaengine: idxd: create user driver for wq 'device'  Dave Jiang  1  -14/+0
The original architecture of /sys/bus/dsa invented a scheme whereby a single entry in the list of bus drivers, /sys/bus/drivers/dsa, handled all device types and internally routed them to different drivers. Those internal drivers were invisible to userspace. Now, as /sys/bus/dsa wants to grow support for alternate drivers for a given device, for example vfio-mdev instead of kernel-internal-dmaengine, a proper bus device-driver model is needed. The first step in that process is separating the existing omnibus/implicit "dsa" driver into proper individual drivers registered on /sys/bus/dsa. Establish the idxd_user_drv driver that controls the enabling and disabling of the wq and also registers and unregisters a char device to allow user space to mmap the descriptor submission portal. The cdev-related bits are moved to the cdev driver probe/remove and out of the drv_enable/disable_wq() calls. These bits are exclusive to the cdev operation and not part of the generic enable/disable of the wq device. Reviewed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/162637467578.744545.10203997610072341376.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>