summaryrefslogtreecommitdiffstats
path: root/drivers/misc/habanalabs/habanalabs.h
AgeCommit message (Collapse)AuthorFilesLines
2019-05-24habanalabs: halt debug engines on user process closeOmer Shpigelman1-0/+2
This patch fix a potential bug where a user's process has closed unexpectedly without disabling the debug engines. In that case, the debug engines might continue running but because the user's MMU mappings are going away, we will get page fault errors. This behavior is also opposed to the general rule where nothing runs on the device after the user process closes. The patch stops the debug H/W engines upon process termination and thus makes sure nothing runs on the device after the process goes away. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-04-30habanalabs: increase timeout if working with simulatorDalit Ben Zoor1-1/+6
Where there is a spike in the CPU consumption, it may cause random failures in the C/I since the KMD timeout for CPU and/or QMAN0 jobs expires and it stops communicating to the simulator. This commit fixes it by increasing timeout on polling functions if working with simulator. Signed-off-by: Dalit Ben Zoor <dbenzoor@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-05-01habanalabs: remove redundant member from parser structDalit Ben Zoor1-3/+0
use_virt_addr member was used for telling whether to treat the addresses in the CB as virtual during parsing. We disabled it only when calling the parser from the driver memset device function, and since this call had been removed, it should always be enabled. Signed-off-by: Dalit Ben Zoor <dbenzoor@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-05-01habanalabs: Manipulate DMA addresses in ASIC functionsTomer Tayar1-6/+4
Routing device accesses to the host memory requires the usage of a base offset, which is canceled by the iATU just before leaving the device. The value of the base offset might be distinctive between different ASIC types. The manipulation of the addresses is currently used throughout the driver code, and one should be aware to it whenever providing a host memory address to the device. This patch removes this manipulation from the driver common code, and moves it to the ASIC specific functions that are responsible for host memory allocation/mapping. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-05-01habanalabs: rename functions to improve code readabilityOded Gabbay1-15/+15
This patch renames four functions in the ASIC-specific functions section, so it will be easier to differentiate them from the generic kernel functions with the same name. This will help in future code reviews, to make sure we don't use the kernel functions directly. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-04-28habanalabs: Use single pool for CPU accessible host memoryTomer Tayar1-0/+12
The device's CPU accessible memory on host is managed in a dedicated pool, except for 2 regions - Primary Queue (PQ) and Event Queue (EQ) - which are allocated from generic DMA pools. Due to address length limitations of the CPU, the addresses of all these memory regions must have the same MSBs starting at bit 40. This patch modifies the allocation of the PQ and EQ to be also from the dedicated pool, to ensure compliance with the limitation. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-04-28habanalabs: return old dram bar address upon changeOded Gabbay1-2/+3
This patch changes the ASIC interface function that changes the DRAM bar window. The change is to return the old address that the DRAM bar pointed to instead of an error code. This simplifies the code that use this function (mainly in debugfs) to restore the bar to the old setting. This is also needed for easier support in future ASICs. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-04-25habanalabs: rename restore to ctx_switch when appropriateOded Gabbay1-8/+9
This patch only does renaming of certain variables and structure members, and their accompanied comments. This is done to better reflect the actions these variables and members represent. There is no functional change in this patch. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-04-22habanalabs: use ASIC functions interface for rreg/wregOded Gabbay1-6/+26
This patch slightly changes the macros of RREG32 and WREG32, which are used when reading or writing from registers. Instead of directly calling a function in the common code from these macros, the new code calls a function from the ASIC functions interface. This change allows us to share much more code between real ASICs and simulators, which in turn reduces the maintenance burden and the chances for forgetting to port code between the ASIC files. The patch also implements the hl_poll_timeout macro, instead of calling the generic readl_poll_timeout macro. This is required to allow use of this macro in the simulator files. As a result from this change, more functions in goya.c are shared with the simulator and therefore, should not be defined as static. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-04-10habanalabs: Cancel pr_fmt() definition dependency on includes orderTomer Tayar1-2/+0
pr_fmt() should be defined before including linux/printk.h, either directly or indirectly, in order to avoid redefinition of the macro. Currently the macro definition is in habanalabs.h, which is included in many files, and that makes the addition/reorder of includes to be prone to compilation errors. This patch cancels this dependency by defining the macro only in the few source files that use it. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-04-04habanalabs: ASIC_AUTO_DETECT enum value is redundantOded Gabbay1-5/+3
This patch removes the enum value of ASIC_AUTO_DETECT because we can use the validity of the pdev variable to know whether we have a real device or a simulator. For a real device, we detect the asic type from the device ID while for a simulator, the simulator code calls create_hdev() with the specified ASIC type. Set ASIC_INVALID as the first option in the enum to make sure that no other enum value will receive the value 0 (which indicates a non-existing entry in the simulator array). Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-04-02habanalabs: refactoring in goya.cOded Gabbay1-4/+0
This patch does some refactoring in goya.c to make code more reusable between goya code and the goya simulator code (which is not upstreamed). In addition, the patch removes some dead functions from goya.c which are not used by the current upstream code Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-04-01habanalabs: add new IOCTL for debug, tracing and profilingOmer Shpigelman1-0/+25
Habanalabs ASICs use the ARM coresight infrastructure to support debug, tracing and profiling of neural networks topologies. Because the coresight is configured using register writes and reads, and some of the registers hold sensitive information (e.g. the address in the device's DRAM where the trace data is written to), the user must go through the kernel driver to configure this mechanism. This patch implements the common code of the IOCTL and calls the ASIC-specific function for the actual H/W configuration. The IOCTL supports configuration of seven coresight components: ETR, ETF, STM, FUNNEL, BMON, SPMU and TIMESTAMP The user specifies which component he wishes to configure and provides a pointer to a structure (located in its process space) that contains the relevant configuration. The common code copies the relevant data from the user-space to kernel space and then calls the ASIC-specific function to do the H/W configuration. After the configuration is done, which is usually composed of several IOCTL calls depending on what the user wanted to trace, the user can start executing the topology. The trace data will be written to the user's area in the device's DRAM. After the tracing operation is complete, and user will call the IOCTL again to disable the tracing operation. The user also need to read values from registers for some of the components (e.g. the size of the trace data in the device's DRAM). In that case, the user will provide a pointer to an "output" structure in user-space, which the IOCTL code will fill according the to selected component. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-03-24habanalabs: add device status option to INFO IOCTLDalit Ben Zoor1-0/+1
This patch adds a new opcode to INFO IOCTL that returns the device status. This will allow users to query the device status in order to avoid sending command submissions while device is in reset. Signed-off-by: Dalit Ben Zoor <dbenzoor@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-03-07habanalabs: keep track of the device's dma maskOded Gabbay1-1/+4
This patch refactors the code that is responsible to set the DMA mask for the device. Upon each change of the dma mask, the driver will save the new value that was set. This is needed in order to make sure we don't try to increase the mask a second time, in case we failed in the first time. This is especially relevant for Power machines, as that may cause a change in configuration of the TVT which will break the device. Goya will first try to set the device's dma mask to 39 bits, so that the memory that is allocated on the host machine for communication with the device's cpu will be in a bus address which is lower then 39 bits. Later, Goya will try to increase that mask to 48 bits, but only if setting the mask to 39 bits was successful. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-02-24habanalabs: add MMU shadow mappingOmer Shpigelman1-9/+15
This patch adds shadow mapping to the MMU module. The shadow mapping allows traversing the page table in host memory rather reading each PTE from the device memory. It brings better performance and avoids reading from invalid device address upon PCI errors. Only at the end of map/unmap flow, writings to the device are performed in order to sync the H/W page tables with the shadow ones. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-03-07habanalabs: Add a printout with the name of a busy engineTomer Tayar1-1/+7
Print the name of a busy engine when checking if a device is idle. The change is done mainly to help a user to pinpoint problems in his topology's recipe. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-03-05habanalabs: Move PCI code into common fileTomer Tayar1-0/+20
Move duplicated PCI-related code from ASIC-specific files into the common pci.c file. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-03-04habanalabs: Move device CPU code into common fileTomer Tayar1-0/+17
This patch moves the code that is responsible of the communication vs. the F/W to a dedicated file. This will allow us to share the code between different ASICs. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-03-03habanalabs: perform accounting for active CSOded Gabbay1-5/+8
This patch adds accounting for active CS. Active means that the CS was submitted to the H/W queues and was not completed yet. This is necessary to support suspend operation. Because the device will be reset upon suspend, we can only suspend after all active CS have been completed. Hence, we need to perform accounting on their number. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-03-05habanalabs: fix MMU number of pages calculationOmer Shpigelman1-4/+4
The requested allocation size is 64bit, hence the number of requested pages and the total requested size should 64bit as well. This patch fixes all places where these are treated as 32bit. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-02-28habanalabs: disable CPU access on timeoutsOded Gabbay1-0/+2
This patch provides a workaround for a bug in the F/W where the response time for a request from KMD may take more then 100ms. This could cause the queue between KMD and the F/W to get out of sync. The WA is to: 1. Increase the timeout of ALL requests to 1s. 2. In case a request isn't answered in time, mark the state as "cpu_disabled" and prevent sending further requests from KMD to the F/W. This will eventually lead to a heartbeat failure and hard reset of the device. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-28habanalabs: add MMU DRAM default page mappingOmer Shpigelman1-1/+11
This patch provides a workaround for a H/W bug in Goya, where access to RAZWI from TPC can cause PCI completion timeout. The WA is to use the device MMU to map any unmapped DRAM memory to a default page in the DRAM. That way, the TPC will never reach RAZWI upon accessing a bad address in the DRAM. When a DRAM page is mapped by the user, its default mapping is overwritten. Once that page is unmapped, the MMU driver will map that page to the default page. To help debugging, the driver will set the default page area to 0x99 on device initialization. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-27habanalabs: make functions static or declare themOded Gabbay1-2/+0
This patch fixes the below sparse warnings by either making the functions static or by adding a declaration in the relevant header file. In addition, the patch removes goya_mmap completely as it doesn't add any additional benefit. Fixes the following sparse warnings: drivers/misc/habanalabs/habanalabs_drv.c:24:1: warning: symbol 'hl_devs_idr' was not declared. Should it be static? drivers/misc/habanalabs/habanalabs_drv.c:25:1: warning: symbol 'hl_devs_idr_lock' was not declared. Should it be static? drivers/misc/habanalabs/memory.c:1451:5: warning: symbol 'hl_vm_ctx_init_with_ranges' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:396:5: warning: symbol 'goya_send_pci_access_msg' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:417:5: warning: symbol 'goya_pci_bars_map' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:557:6: warning: symbol 'goya_reset_link_through_bridge' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:774:5: warning: symbol 'goya_early_fini' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:857:6: warning: symbol 'goya_late_fini' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:971:5: warning: symbol 'goya_sw_fini' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:1233:5: warning: symbol 'goya_init_cpu_queues' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:2914:5: warning: symbol 'goya_suspend' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:2939:5: warning: symbol 'goya_resume' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:2952:5: warning: symbol 'goya_mmap' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:2957:5: warning: symbol 'goya_cb_mmap' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:2973:6: warning: symbol 'goya_ring_doorbell' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:3063:6: warning: symbol 'goya_flush_pq_write' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:3068:6: warning: symbol 'goya_dma_alloc_coherent' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:3074:6: warning: symbol 'goya_dma_free_coherent' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:3080:6: warning: symbol 'goya_get_int_queue_base' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:3138:5: warning: symbol 'goya_send_job_on_qman0' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:3295:5: warning: symbol 'goya_test_queue' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:3417:6: warning: symbol 'goya_dma_pool_zalloc' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:3426:6: warning: symbol 'goya_dma_pool_free' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:3432:6: warning: symbol 'goya_cpu_accessible_dma_pool_alloc' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:3448:6: warning: symbol 'goya_cpu_accessible_dma_pool_free' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:3458:5: warning: symbol 'goya_dma_map_sg' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:3467:6: warning: symbol 'goya_dma_unmap_sg' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:3473:5: warning: symbol 'goya_get_dma_desc_list_size' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:4210:5: warning: symbol 'goya_parse_cb_no_mmu' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:4261:5: warning: symbol 'goya_parse_cb_no_ext_quque' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:4294:5: warning: symbol 'goya_cs_parser' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:4307:6: warning: symbol 'goya_add_end_of_cb_packets' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:4334:5: warning: symbol 'goya_context_switch' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:4426:6: warning: symbol 'goya_restore_phase_topology' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:4460:5: warning: symbol 'goya_debugfs_read32' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:4510:5: warning: symbol 'goya_debugfs_write32' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:4738:6: warning: symbol 'goya_handle_eqe' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:4836:6: warning: symbol 'goya_get_events_stat' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:5075:5: warning: symbol 'goya_send_heartbeat' was not declared. Should it be static? drivers/misc/habanalabs/goya/goya.c:5253:5: warning: symbol 'goya_get_eeprom_data' was not declared. Should it be static? Reported-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-27habanalabs: allow memory allocations larger than 4GBOded Gabbay1-1/+1
This patch increase the size field in the uapi structure of the Memory IOCTL from 32-bit to 64-bit. This is to allow the user to allocate and/or map memory in chunks that are larger then 4GB. Goya's device memory (DRAM) can be up to 16GB, and for certain topologies, the user may want an allocation that is larger than 4GB. This change doesn't break current user-space because there was a "pad" field in the uapi structure right after the size field. Changing the size field to be 64-bit and removing the pad field maintains compatibility with current user-space. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-18habanalabs: add debugfs supportOded Gabbay1-0/+189
This patch adds debugfs support to the driver. It allows the user-space to display information that is contained in the internal structures of the driver, such as: - active command submissions - active user virtual memory mappings - number of allocated command buffers It also enables the user to perform reads and writes through Goya's PCI bars. Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-18habanalabs: implement INFO IOCTLOded Gabbay1-0/+2
This patch implements the INFO IOCTL. That IOCTL is used by the user to query information that is relevant/needed by the user in order to submit deep learning jobs to Goya. The information is divided into several categories, such as H/W IP, Events that happened, DDR usage and more. Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-18habanalabs: add virtual memory and MMU modulesOmer Shpigelman1-2/+187
This patch adds the Virtual Memory and MMU modules. Goya has an internal MMU which provides process isolation on the internal DDR. The internal MMU also performs translations for transactions that go from Goya to the Host. The driver is responsible for allocating and freeing memory on the DDR upon user request. It also provides an interface to map and unmap DDR and Host memory to the device address space. The MMU in Goya supports 3-level and 4-level page tables. With 3-level, the size of each page is 2MB, while with 4-level the size of each page is 4KB. In the DDR, the physical pages are always 2MB. Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-18habanalabs: add command submission moduleOded Gabbay1-0/+275
This patch adds the main flow for the user to submit work to the device. Each work is described by a command submission object (CS). The CS contains 3 arrays of command buffers: One for execution, and two for context-switch (store and restore). For each CB, the user specifies on which queue to put that CB. In case of an internal queue, the entry doesn't contain a pointer to the CB but the address in the on-chip memory that the CB resides at. The driver parses some of the CBs to enforce security restrictions. The user receives a sequence number that represents the CS object. The user can then query the driver regarding the status of the CS, using that sequence number. In case the CS doesn't finish before the timeout expires, the driver will perform a soft-reset of the device. Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-18habanalabs: add device reset supportOded Gabbay1-0/+52
This patch adds support for doing various on-the-fly reset of Goya. The driver supports two types of resets: 1. soft-reset 2. hard-reset Soft-reset is done when the device detects a timeout of a command submission that was given to the device. The soft-reset process only resets the engines that are relevant for the submission of compute jobs, i.e. the DMA channels, the TPCs and the MME. The purpose is to bring the device as fast as possible to a working state. Hard-reset is done in several cases: 1. After soft-reset is done but the device is not responding 2. When fatal errors occur inside the device, e.g. ECC error 3. When the driver is removed Hard-reset performs a reset of the entire chip except for the PCI controller and the PLLs. It is a much longer process then soft-reset but it helps to recover the device without the need to reboot the Host. After hard-reset, the driver will restore the max power attribute and in case of manual power management, the frequencies that were set. This patch also adds two entries to the sysfs, which allows the root user to initiate a soft or hard reset. Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-18habanalabs: add sysfs and hwmon supportOded Gabbay1-0/+102
This patch add the sysfs and hwmon entries that are exposed by the driver. Goya has several sensors, from various categories such as temperature, voltage, current, etc. The driver exposes those sensors in the standard hwmon mechanism. In addition, the driver exposes a couple of interfaces in sysfs, both for configuration and for providing status of the device or driver. The configuration attributes is for Power Management: - Automatic or manual - Frequency value when moving to high frequency mode - Maximum power the device is allowed to consume The rest of the attributes are read-only and provide the following information: - Versions of the various firmwares running on the device - Contents of the device's EEPROM - The device type (currently only Goya is supported) - PCI address of the device (to allow user-space to connect between /dev/hlX to PCI address) - Status of the device (operational, malfunction, in_reset) - How many processes are open on the device's file Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-18habanalabs: add event queue and interruptsOded Gabbay1-2/+38
This patch adds support for receiving events from Goya's control CPU and for receiving MSI-X interrupts from Goya's DMA engines and CPU. Goya's PCI controller supports up to 8 MSI-X interrupts, which only 6 of them are currently used. The first 5 interrupts are dedicated for Goya's DMA engine queues. The 6th interrupt is dedicated for Goya's control CPU. The DMA queue will signal its MSI-X entry upon each completion of a command buffer that was placed on its primary queue. The driver will then mark that CB as completed and free the related resources. It will also update the command submission object which that CB belongs to. There is a dedicated event queue (EQ) between the driver and Goya's control CPU. The EQ is located on the Host memory. The control CPU writes a new entry to the EQ for various reasons, such as ECC error, MMU page fault, Hot temperature. After writing the new entry to the EQ, the control CPU will trigger its dedicated MSI-X entry to signal the driver that there is a new entry in the EQ. The driver will then read the entry and act accordingly. Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-18habanalabs: add h/w queues moduleOded Gabbay1-1/+173
This patch adds the H/W queues module and the code to initialize Goya's various compute and DMA engines and their queues. Goya has 5 DMA channels, 8 TPC engines and a single MME engine. For each channel/engine, there is a H/W queue logic which is used to pass commands from the user to the H/W. That logic is called QMAN. There are two types of QMANs: external and internal. The DMA QMANs are considered external while the TPC and MME QMANs are considered internal. For each external queue there is a completion queue, which is located on the Host memory. The differences between external and internal QMANs are: 1. The location of the queue's memory. External QMANs are located on the Host memory while internal QMANs are located on the on-chip memory. 2. The external QMAN write an entry to a completion queue and sends an MSI-X interrupt upon completion of a command buffer that was given to it. The internal QMAN doesn't do that. Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-18habanalabs: add basic Goya h/w initializationOded Gabbay1-0/+16
This patch adds the basic part of Goya's H/W initialization. It adds code that initializes Goya's internal CPU, various registers that are related to internal routing, scrambling, workarounds for H/W bugs, etc. It also initializes Goya's security scheme that prevents the user from abusing Goya to steal data from the host, crash the host, change Goya's F/W, etc. Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-18habanalabs: add command buffer moduleOded Gabbay1-0/+86
This patch adds the command buffer (CB) module, which allows the user to create and destroy CBs and to map them to the user's process address-space. A command buffer is a memory blocks that reside in DMA-able address-space and is physically contiguous so it can be accessed by the device without MMU translation. The command buffer memory is allocated using the coherent DMA API. When creating a new CB, the IOCTL returns a handle of it, and the user-space process needs to use that handle to mmap the buffer to get a VA in the user's address-space. Before destroying (freeing) a CB, the user must unmap the CB's VA using the CB handle. Each CB has a reference counter, which tracks its usage in command submissions and also its mmaps (only a single mmap is allowed). The driver maintains a pool of pre-allocated CBs in order to reduce latency during command submissions. In case the pool is empty, the driver will go to the slow-path of allocating a new CB, i.e. calling dma_alloc_coherent. Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-18habanalabs: add context and ASID modulesOded Gabbay1-0/+71
This patch adds two modules - ASID and context. Each user process that opens a device's file must have at least one context before it is able to "work" with the device. Each context has its own device address-space and contains information about its runtime state (its active command submissions). To have address-space separation between contexts, each context is assigned a unique ASID, which stands for "address-space id". Goya supports up to 1024 ASIDs. Currently, the driver doesn't support multiple contexts. Therefore, the user doesn't need to actively create a context. A "primary context" is created automatically when the user opens the device's file. Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-18habanalabs: add basic Goya supportOded Gabbay1-0/+137
This patch adds a basic support for the Goya device. The code initializes the device's PCI controller and PCI bars. It also initializes various S/W structures and adds some basic helper functions. Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-18habanalabs: add skeleton driverOded Gabbay1-0/+131
This patch adds the habanalabs skeleton driver. The driver does nothing at this stage except very basic operations. It contains the minimal code to insmod and rmmod the driver and to create a /dev/hlX file per PCI device. Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>