summaryrefslogtreecommitdiffstats
path: root/drivers/scsi/lpfc/lpfc_nvme.c
AgeCommit message (Collapse)AuthorFilesLines
2018-03-14scsi: lpfc: make several unions static, fix non-ANSI prototypeColin Ian King1-4/+4
There are several unions that are local to the source and do not need to be in global scope, so make them static. Also add in a missing void parameter to functions lpfc_nvme_cmd_template and lpfc_nvmet_cmd_template to clean up non-ANSI warning. Cleans up sparse warnings: drivers/scsi/lpfc/lpfc_nvme.c:68:19: warning: symbol 'lpfc_iread_cmd_template' was not declared. Should it be static? drivers/scsi/lpfc/lpfc_nvme.c:69:19: warning: symbol 'lpfc_iwrite_cmd_template' was not declared. Should it be static? drivers/scsi/lpfc/lpfc_nvme.c:70:19: warning: symbol 'lpfc_icmnd_cmd_template' was not declared. Should it be static? drivers/scsi/lpfc/lpfc_nvme.c:74:24: warning: non-ANSI function 'lpfc_tsend_cmd_template' was not declared. Should it be static? drivers/scsi/lpfc/lpfc_nvmet.c:78:19: warning: symbol 'lpfc_treceive_cmd_template' was not declared. Should it be static? drivers/scsi/lpfc/lpfc_nvmet.c:79:19: warning: symbol 'lpfc_trsp_cmd_template' was not declared. Should it be static? drivers/scsi/lpfc/lpfc_nvmet.c:83:25: warning: non-ANSI function declaration of function 'lpfc_nvmet_cmd_template' Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-12scsi: lpfc: Streamline NVME Initiator WQE setupJames Smart1-127/+200
To reduce latency when initializing WQE content, create templates for the most common wqes. This reduces the number of operations taken to set the content. It's not a lot of speed up, but every bit helps. This patch updates the NVME initiator path. [mkp: fixed typo] Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-12scsi: lpfc: Code cleanup for 128byte wqe data typeJames Smart1-13/+12
The driver is very sloppy about the WQE structure passed between routines. The base struct type is a 64byte wqe. But in many routines they typecast and access 128byte wqes. There were a couple of cases in the past (corrected already) where the typecasts were incorrectly done and the 64byte buffer was accessed as a 128 byte buffer. Clean this up by properly declaring wqe's as 128byte wqe's and removing the typecasts. 64byte wqes are considered a subset of the 128byte wqes. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-02-22scsi: lpfc: Work around NVME cmd iu SGL typeJames Smart1-15/+36
The hardware offload for NVME commands was created when the FC-NVME standard was setting SGL Descriptor Type to SGL Data Block Descriptor (0h) and SGL Descriptor Sub Type to Address (0h). A late change in NVMe-over-Fabrics obsoleted these values, creating a transport SGL descriptor type with new values to go into these fields. For initial hardware support, in order to be compliant to the spec, use host-supplied cmd IU buffers instead of the adapter generated values. Later hardware will correct this. Add a module parameter to override this offload disablement if looking for lowest latency. This is reasonable as nothing in FC-NVME uses the SQE SGL values. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-02-22scsi: lpfc: Fix nvme embedded io length on new hardwareJames Smart1-1/+1
Newer hardware more strictly enforces buffer lenghts, causing an mis-set value to be identified. Older hardware won't catch it. The difference is benign on old hardware. Set the right embedded buffer length for nvme ios. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-02-22scsi: lpfc: Add embedded data pointers for enhanced performanceJames Smart1-0/+18
The current driver isn't taking advantage of a performance hint whereby the initial data buffer descriptor can be placed in the WQE as well as the SGL. Add the logic to detect support for the feature and to use it when supported. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-02-12scsi: lpfc: Update 11.4.0.7 modified files for 2018 CopyrightJames Smart1-1/+1
Updated Copyright in files updated 11.4.0.7 Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-02-12scsi: lpfc: Fix nonrecovery of NVME controller after cable swap.James Smart1-11/+13
In a test that is doing large numbers of cable swaps on the target, the nvme controllers wouldn't reconnect. During the cable swaps, the targets n_port_id would change. This information was passed to the nvme-fc transport, in the new remoteport registration. However, the nvme-fc transport didn't update the n_port_id value in the remoteport struct when it reused an existing structure. Later, when a new association was attempted on the remoteport, the driver's NVME LS routine would use the stale n_port_id from the remoteport struct to address the LS. As the device is no longer at that address, the LS would go into never never land. Separately, the nvme-fc transport will be corrected to update the n_port_id value on a re-registration. However, for now, there's no reason to use the transports values. The private pointer points to the drivers node structure and the node structure is up to date. Therefore, revise the LS routine to use the drivers data structures for the LS. Augmented the debug message for better debugging in the future. Also removed a duplicate if check that seems to have slipped in. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-02-12scsi: lpfc: Fix IO failure during hba reset testing with nvme io.James Smart1-2/+2
A stress test repeatedly resetting the adapter while performing io would eventually report I/O failures and missing nvme namespaces. The driver was setting the nvmefc_fcp_req->private pointer to NULL during the IO completion routine before upcalling done(). If the transport was also running an abort for that IO, the driver would fail the abort with message 6140. Failing the abort is not allowed by the nvme-fc transport, as it mandates that the io must be returned back to the transport. As that does not happen, the transport controller delete has an outstanding reference and can't complete teardown. The NULL-ing of the private pointer should be done only when the io is considered complete. It's complete when the adapter returns the exchange with the "exchange busy" flag clear. Move the NULL'ing of the structure to the done case. This leaves the io contexts set while it is busy and until the subsequent XRI_ABORTED completion which returns the exchange is received. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-20scsi: lpfc: Beef up stat counters for debugJames Smart1-4/+39
If log verbose in not turned on, its hard to tell when certain error paths get hit. Add stats counters and corresponding logic to debugfs/sysfs to aid understanding what paths were traversed. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-20scsi: lpfc: Fix infinite wait when driver unregisters a remote NVME port.James Smart1-84/+51
When unregistering a remote port the lpfc driver would eventually wait for the remoteport_unreg done callback. But the driver never completed the io aborts that would allow the connections to terminate thus the unreg done callback was never issued. Turns out the coding style of the driver allowed for the wait to occur on the same cpu that the deferred isr is called on. The blocking for the wait, blocked the isr, and as the isr didn't run, the io aborts wouldn't finish. Turns out there was never a good reason to block waiting for the unreg done in the first place. The driver can continue execution and the ref counting within the driver will do the right thing. Resolve by removing the wait and patching up a few cases where the ref counting didn't look right - mainly cases where the remote port comes back before the aborts had completed and the unreg done had been called. Additionally, a few places which used pointer values to guide driver actions weren't protected by lock, so correct those. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-20scsi: lpfc: Fix random heartbeat timeouts during heavy IOJames Smart1-21/+45
NVME targets appear to randomly disconnect from the initiator when running heavy IO. The error is due to the host aggregate (across all controllers) io load was beyond the maximum exchange count for nvme on the adapter. The driver was properly returning a resource busy status, but the io load was so great heartbeat commands would be bounced and not have a successful retry within the fuzz amount for the nvme heartbeat (yes, a very high io load!). Thus the target was terminating the controller due to a keep alive failure. Resolve by reserving a few exchanges (by counters) which can be used when the adapter is out of normal exchanges and the command is a NVME heartbeat command. As counters are used, while the reserved command is outstanding, as soon as any other exchange completes, the counters are adjusted and the reserved count is replenished. The heartbeat completes execution in a normal fashion. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-04scsi: lpfc: small sg cnt cleanupJames Smart1-1/+2
The logic for sg_seg_cnt is a bit convoluted. This patch tries to clean up a couple of areas, especially around the +2 and +1 logic. This patch: - Cleans up the lpfc_sg_seg_cnt attribute to specify a real minimum rather than making the minimum be whatever the default is. - Removes the hardcoding of +2 (for the number of elements we use in a sgl for cmd iu and rsp iu) and +1 (an additional entry to compensate for nvme's reduction of io size based on a possible partial page) logic in sg list initialization. In the case where the +1 logic is referenced in host and target io checks, use the values set in the transport template as that value was properly set. There can certainly be more done in this area and it will be addressed in combined host/target driver effort. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-04scsi: lpfc: Fix driver handling of nvme resources during unloadJames Smart1-11/+85
During driver unload, the driver may crash due to NULL pointers. The NULL pointers were due to the driver not protecting itself sufficiently during some of the teardown paths. Additionally, the driver was not waiting for and cleanup up nvme io resources. As such, the driver wasn't making the callbacks to the transport, stalling the transports association teardown. This patch waits for io clean up before tearding down and adds checks for possible NULL pointers. Cc: <stable@vger.kernel.org> # 4.12+ Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-04scsi: lpfc: Fix crash during driver unload with running nvme trafficJames Smart1-0/+14
When the driver is unloading, the nvme transport could be in the process of submitting new requests, will send abort requests to terminate associations, or may make LS-related requests. The driver's abort and request entry points currently is ignorant of the unloading state and is starting the requests even though the infrastructure to complete them continues to teardown. Change the entry points for new requests to check whether unloading and if so, reject the requests. Abort routines check unloading, and if so, noop the request. An abort is noop'd as the teardown paths are already aborting/terminating the io outstanding at the time the teardown initiated. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-04scsi: lpfc: Correct driver deregistrations with host nvme transportJames Smart1-6/+110
The driver's interaction with the host nvme transport has been incorrect for a while. The driver did not wait for the unregister callbacks (waited only 5 jiffies). Thus the driver may remove objects that may be referenced by subsequent abort commands from the transport, and the actual unregister callback was effectively a noop. This was especially problematic if the driver was unloaded. The driver now waits for the unregister callbacks, as it should, before continuing with teardown. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-04scsi: lpfc: correct port registrations with nvme_fcJames Smart1-1/+2
The driver currently registers any remote port that has NVME support. It should only be registering target ports. Register only target ports. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-11-14Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds1-63/+111
Pull SCSI updates from James Bottomley: "This is mostly updates of the usual suspects: lpfc, qla2xxx, hisi_sas, megaraid_sas, pm80xx, mpt3sas, be2iscsi, hpsa. and a host of minor updates. There's no major behaviour change or additions to the core in all of this, so the potential for regressions should be small (biggest potential being in the scsi error handler changes)" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (203 commits) scsi: lpfc: Fix hard lock up NMI in els timeout handling. scsi: mpt3sas: remove a stray KERN_INFO scsi: mpt3sas: cleanup _scsih_pcie_enumeration_event() scsi: aacraid: use timespec64 instead of timeval scsi: scsi_transport_fc: add 64GBIT and 128GBIT port speed definitions scsi: qla2xxx: Suppress a kernel complaint in qla_init_base_qpair() scsi: mpt3sas: fix dma_addr_t casts scsi: be2iscsi: Use kasprintf scsi: storvsc: Avoid excessive host scan on controller change scsi: lpfc: fix kzalloc-simple.cocci warnings scsi: mpt3sas: Update mpt3sas driver version. scsi: mpt3sas: Fix sparse warnings scsi: mpt3sas: Fix nvme drives checking for tlr. scsi: mpt3sas: NVMe drive support for BTDHMAPPING ioctl command and log info scsi: mpt3sas: Add-Task-management-debug-info-for-NVMe-drives. scsi: mpt3sas: scan and add nvme device after controller reset scsi: mpt3sas: Set NVMe device queue depth as 128 scsi: mpt3sas: Handle NVMe PCIe device related events generated from firmware. scsi: mpt3sas: API's to remove nvme drive from sml scsi: mpt3sas: API 's to support NVMe drive addition to SML ...
2017-10-16scsi: lpfc: Fix a precedence bug in lpfc_nvme_io_cmd_wqe_cmpl()Dan Carpenter1-1/+1
The ! has higher precedence than the & operation. I've added parenthesis so this works as intended. Fixes: 952c303b329c ("scsi: lpfc: Ensure io aborts interlocked with the target.") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-02scsi: lpfc: correct nvme sg segment count checkJames Smart1-2/+2
The internal cfg flag is actually smaller, by 1 (for a partial page sge), than the sg list maintained by the driver. Thus the check on sg segments errored out when it shouldn't have Ensure the check is +1 Note: having a value that is less than what it really is is bogus. Correcting it now would be a significant rework. Add this item to the list to be refactored in the merge with efct. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-02scsi: lpfc: Fix oops of nvme host during driver unload.Dick Kennedy1-0/+8
When running NVME io as a NVME host, if the driver is unloaded there would be oops in lpfc_sli4_issue_wqe. When unloading, controllers are torn down and the transport initiates set_property commands to reset the controller and issues aborts to terminate existing io. The drivers nvme abort and fcp io submit routines needed to recognize the driver is unloading and fail the new requests. It didn't, resulting in the oops. Revise the ls and fcp io submit routines to detect the unloading state and properly handle their cleanup. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-02scsi: lpfc: Ensure io aborts interlocked with the target.Dick Kennedy1-25/+34
Before releasing nvme io back to the io stack for possible retry on other paths, ensure the io termination is interlocked with the target device by ensuring the entire ABTS-LS protocol is complete. Additionally, FC-NVME ABTS-LS protocol does not use RRQ. Remove RRQ behavior from ABTS-LS. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-02scsi: lpfc: Fix crash in lpfc_nvme_fcp_io_submit during LIPDick Kennedy1-0/+10
The driver is seeing a NULL pointer in lpfc_nvme_fcp_io_submit. This was ultimately due to a transport AER being sent on a terminated controller, thus some of the values were not set. In case we're in a system without a corrected transport and in case a race condition occurs where we enter the routine as the teardown is happening in a separate thread, validate the parameters before starting the io. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-02scsi: lpfc: Reduce log spew on controller reconnectsJames Smart1-3/+3
There are several log messages that report abnormal terminations that by default are marked warn. These are typically the result of failures due to invalid controller state or abort completions. They are all natural when a controller resets. Unfortunately, as they are logged by default, it makes the admin very concerned. Convert the messages to Info. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-02scsi: lpfc: Make ktime sampling more accurateDick Kennedy1-11/+23
Need to make ktime samples more accurate If ktime is turned on in the middle of an IO, the max calculation could be misleading. Base sampling on the start time of the IO as opposed to ktime_on. Make ISR ktime timestamps be from when CQE is read instead of EQE. Added additional sanity checks when deciding whether to accept an IO sample or not. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-02scsi: lpfc: Fix lpfc nvme host rejecting IO with Not Ready messageDick Kennedy1-18/+28
In a link bounce scenario, a condition can occur where the discovery engine swaps an ndlp structure (address change for an nport). While the swap was successfully executed by the discovery engine, the driver did not properly detect a change in the ndlp bound to the nvme rport. This error resulted in the nvme host transport issuing an IO to the correct nvme rport, but the lpfc driver addressed a ndlp with an NLP_UNUSED status and failed the io. This resulting it it looking like there were missing namespaces and applications failed due to io errors. To fix, in lpfc_nvme_register_rport, rework the "rebind" case to break the nvme rport<->ndlp association when the ndlp already has an nrport. Then rebind the rport to the correct ndlp data and backpointers. [mkp: typo] Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-09-25scsi: lpfc: Cocci spatch "pool_zalloc-simple"Thomas Meyer1-4/+3
Use *_pool_zalloc rather than *_pool_alloc followed by memset with 0. Found by coccinelle spatch "api/alloc/pool_zalloc-simple.cocci" Signed-off-by: Thomas Meyer <thomas@m3y3r.de> Acked-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-09-25lpfc: remove use of FC-specific error codesJames Smart1-1/+1
The lpfc driver uses the FC-specific error when it needed to return an error to the FC-NVME transport. Convert to use a generic value instead. Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-24scsi: lpfc: remove console log clutterJames Smart1-1/+1
Change hw queue binding messages to info - not error. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24scsi: lpfc: Fix bad sgl reposting after 2nd adapter resetDick Kennedy1-2/+9
Port issue was fixed, the hbacmd reset would take more than 8 minutes to complete. There were conflicting NVME SGL posting/reposting responsibilities between lpfc_online()/lpfc_sli4_hba_setup() and lpfc_nvme_create_localport(). The lpfc_online() causes a REPOST on existing NVME SGLs which is not released during the fc port reset. However, lpfc_nvme_create_localport() wants to allocate new NVME buffers and post them. Both cancelled out each other which had a side effect of hosing the mailbox handling that was used to remove the sgl lists - causing multiple 60s mbx timeouts. Fix by preserving all SGL lists over the fc port reset. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24scsi: lpfc: Correct return error codes to align with nvme_fc transportDick Kennedy1-5/+4
Modify driver return error codes to align with host nvme transport. Driver isn't returning Exxx error codes to properly reflect out of resource or connectivity conditions (-EBUSY), yet there were hard error conditions returning -EBUSY. Ensure the following situations return the proper return code: - Temporary failures or temporary resource availability: -EBUSY - Connectivity issues: -ENODEV All others are treated as hard errors and return an -Exxx value that indicates the type of error. Also, lpfc_sli4_issue_wqe() was modified to not translate error from -Exxx to WQE state. This allows lpfc_nvme_fcp_io_submit() routine to just return whatever -E value was returned from other routines. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24scsi: lpfc: Fix oops when NVME Target is discovered in a nonNVME environmentDick Kennedy1-0/+3
lpfc oops when it discovers a NVME target but is configured for SCSI only operation. Oops is in lpfc_nvme_register_port+0x33/0x300. The localport is not valid so it should not have been referenced. Added validity check for localport Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-07scsi: lpfc: Replace PCI pool old APIRomain Perier1-3/+3
The PCI pool API is deprecated. This commit replaces the PCI pool old API by the appropriate function with the DMA pool API. It also updates some comments, accordingly. Signed-off-by: Romain Perier <romain.perier@collabora.com> Reviewed-by: Peter Senna Tschudin <peter.senna@collabora.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-26scsi: lpfc: Fix nvme io stoppage after link bounceJames Smart1-34/+1
On link down, transport is calling driver to abort outstanding ios. Driver erroneously rejects the abort if the port indicates it isn't logged in - which will be the case after the link down. Thus, the io can't clean up. This prevents reconnection at the transport level. Fix by allowing abort to proceed. Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-12scsi: lpfc: Fix counters so outstandng NVME IO count is accurateJames Smart1-6/+13
NVME FC counters don't reflect actual results Since counters are not atomic, or protected by a lock, the values often get screwed up. Make them atomic, like NVMET. Fix up sysfs and debugfs display accordingly Added Outstanding IOs to stats display Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-12scsi: lpfc: Fix transition nvme-i rport handling to nport only.James Smart1-18/+0
As the devloss API was implemented in the nvmei driver, an evaluation of the nvme transport and the lpfc driver showed dual management of the rports. This creates a bug possibility when the thread count and SAN size increases. The nvmei driver code was based on a very early transport and was not revisited until the devloss API was introduced. Remove the listhead in the driver's rport data structure and the listhead in the driver's lport data structure. Remove all rport_list traversal. Convert the driver to use the nrport (nvme rport) pointer that is now NULL or nonNULL depending on a devloss action. Convert debugfs and nvme_info in sysfs to use the fc_nodes list in the vport. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-12scsi: lpfc: Add nvme initiator devloss supportJames Smart1-89/+52
Add nvme initiator devloss support The existing implementation was based on no devloss behavior in the transport (e.g. immediate teardown) so code didn't properly handle delayed nvme rport device unregister calls. In addition, the driver was not correctly cycling the rport port role for each register-unregister-reregister process. This patch does the following: Rework the code to properly handle rport device unregister calls and potential re-allocation of the remoteport structure if the port comes back in under dev_loss_tmo. Correct code that was incorrectly cycling the rport port role for each register-unregister-reregister process. Prep the code to enable calling the nvme_fc transport api to dynamically update dev_loss_tmo when the scsi sysfs interface changes it. Memset the rpinfo structure in the registration call to enforce "accept nvme transport defaults" in the registration call. Driver parameters do influence the dev_loss_tmo transport setting dynamically. Simplifies the register function: the driver was incorrectly searching its local rport list to determine resume or new semantics, which is not valid as the transport already handles this. The rport was resumed if the rport handed back matches the ndlp->nrport pointer. Otherwise, devloss fired and the ndlp's nrport is NULL. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-04-25lpfc: Fix memory corruption of the lpfc_ncmd->list pointersJames Smart1-6/+11
lpfc was changing the private pointer that is set/maintained by the nvme_fc transport. This caused two issues: a) the transport, on teardown may erroneous attempt to free whatever address was set; and b) lfpc uses any value set in lpfc_nvme_fcp_abort() and assumes its a valid io request. Correct issue by properly defining a context structure for lpfc. Lpfc also updated to clear the private context structure on io completion. Since this bug caused scrutiny of the way lpfc moves local request structures between lists, also cleaned up list_del()'s to list_del_inits()'s. This is a nvme-specific bug. The patch was cut against the linux-block tree, for-4.12/block tree. It should be pulled in through that tree. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-04-24Update ABORT processing for NVMET.James Smart1-13/+32
The driver with nvme had this routine stubbed. Right now XRI_ABORTED_CQE is not handled and the FC NVMET Transport has a new API for the driver. Missing code path, new NVME abort API Update ABORT processing for NVMET There are 3 new FC NVMET Transport API/ template routines for NVMET: lpfc_nvmet_xmt_fcp_release This NVMET template callback routine called to release context associated with an IO This routine is ALWAYS called last, even if the IO was aborted or completed in error. lpfc_nvmet_xmt_fcp_abort This NVMET template callback routine called to abort an exchange that has an IO in progress nvmet_fc_rcv_fcp_req When the lpfc driver receives an ABTS, this NVME FC transport layer callback routine is called. For this case there are 2 paths thru the driver: the driver either has an outstanding exchange / context for the XRI to be aborted or not. If not, a BA_RJT is issued otherwise a BA_ACC NVMET Driver abort paths: There are 2 paths for aborting an IO. The first one is we receive an IO and decide not to process it because of lack of resources. An unsolicated ABTS is immediately sent back to the initiator as a response. lpfc_nvmet_unsol_fcp_buffer lpfc_nvmet_unsol_issue_abort (XMIT_SEQUENCE_WQE) The second one is we sent the IO up to the NVMET transport layer to process, and for some reason the NVME Transport layer decided to abort the IO before it completes all its phases. For this case there are 2 paths thru the driver: the driver either has an outstanding TSEND/TRECEIVE/TRSP WQE or no outstanding WQEs are present for the exchange / context. lpfc_nvmet_xmt_fcp_abort if (LPFC_NVMET_IO_INP) lpfc_nvmet_sol_fcp_issue_abort (ABORT_WQE) lpfc_nvmet_sol_fcp_abort_cmp else lpfc_nvmet_unsol_fcp_issue_abort lpfc_nvmet_unsol_issue_abort (XMIT_SEQUENCE_WQE) lpfc_nvmet_unsol_fcp_abort_cmp Context flags: LPFC_NVMET_IOP - his flag signifies an IO is in progress on the exchange. LPFC_NVMET_XBUSY - this flag indicates the IO completed but the firmware is still busy with the corresponding exchange. The exchange should not be reused until after a XRI_ABORTED_CQE is received for that exchange. LPFC_NVMET_ABORT_OP - this flag signifies an ABORT_WQE was issued on the exchange. LPFC_NVMET_CTX_RLS - this flag signifies a context free was requested, but we are deferring it due to an XBUSY or ABORT in progress. A ctxlock is added to the context structure that is used whenever these flags are set/read within the context of an IO. The LPFC_NVMET_CTX_RLS flag is only set in the defer_relase routine when the transport has resolved all IO associated with the buffer. The flag is cleared when the CTX is associated with a new IO. An exchange can has both an LPFC_NVMET_XBUSY and a LPFC_NVMET_ABORT_OP condition active simultaneously. Both conditions must complete before the exchange is freed. When the abort callback (lpfc_nvmet_xmt_fcp_abort) is envoked: If there is an outstanding IO, the driver will issue an ABORT_WQE. This should result in 3 completions for the exchange: 1) IO cmpl with XB bit set 2) Abort WQE cmpl 3) XRI_ABORTED_CQE cmpl For this scenerio, after completion #1, the NVMET Transport IO rsp callback is called. After completion #2, no action is taken with respect to the exchange / context. After completion #3, the exchange context is free for re-use on another IO. If there is no outstanding activity on the exchange, the driver will send a ABTS to the Initiator. Upon completion of this WQE, the exchange / context is freed for re-use on another IO. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
2017-04-24Fix max_sgl_segments settings for NVME / NVMETJames Smart1-4/+14
Cannot set NVME segment counts to a large number The existing module parameter lpfc_sg_seg_cnt is used for both SCSI and NVME. Limit the module parameter lpfc_sg_seg_cnt to 128 with the default being 64 for both NVME and NVMET, assuming NVME is enabled in the driver for that port. The driver will set max_sgl_segments in the NVME/NVMET template to lpfc_sg_seg_cnt + 1. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
2017-04-24Fix driver load issues when MRQ=8James Smart1-11/+12
The symptom is that the driver will fail to login to the fabric. The reason is because it is out of iocb resources. There is a one to one relationship between MRQs (receive buffers for NVMET-FC) and iocbs and the default number of IOCBs was not accounting for the number of MRQs that were being created. This fix aligns the number of MRQ resources with the total resources so that it can handle fabric events when needed. Also the initialization of ctxlock to be on FCP commands, NOT LS commands. And modified log messages so that the log output can be correlated with the analyzer trace. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
2017-04-24Fix nvme initiator handling when not enabled.James Smart1-2/+13
Fix nvme initiator handline when CONFIG_LPFC_NVME_INITIATOR is not enabled. With update nvme upstream driver sources, loading the driver with nvme enabled resulting in this Oops. BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 IP: lpfc_nvme_update_localport+0x23/0xd0 [lpfc] PGD 0 Oops: 0000 [#1] SMP CPU: 0 PID: 10256 Comm: lpfc_worker_0 Tainted Hardware name: ... task: ffff881028191c40 task.stack: ffff880ffdf00000 RIP: 0010:lpfc_nvme_update_localport+0x23/0xd0 [lpfc] RSP: 0018:ffff880ffdf03c20 EFLAGS: 00010202 Cause: As the initiator driver completes discovery at different stages, it call lpfc_nvme_update_localport to hint that the DID and role may have changed. In the implementation of lpfc_nvme_update_localport, the driver was not validating the localport or the lport during the execution of the update_localport routine. With the recent upstream additions to the driver, the create_localport routine didn't run and so the localport was NULL causing the page-fault Oops. Fix: Add the CONFIG_LPFC_NVME_INITIATOR preprocessor inclusions to lpfc_nvme_update_localport to turn off all routine processing when the running kernel does not have NVME configured. Add NULL pointer checks on the localport and lport in lpfc_nvme_update_localport and dump messages if they are NULL and just exit. Also one alingment issue fixed. Repalces the ifdef with the IS_ENABLED macro. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
2017-04-24Fix log message in completion path.James Smart1-3/+2
In the lpfc_nvme_io_cmd_wqe_cmpl routine the driver was printing two pointers and the DID for the rport whenever an IO completed on a now that had transitioned to a non active state. There is no need to print the node pointer address for a node that is not active the DID should be enough to debug. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
2017-04-24Fix rejected nvme LS Req.James Smart1-5/+21
In this case, the NVME initiator is sending an LS REQ command on an NDLP that is not MAPPED. The FW rejects it. The lpfc_nvme_ls_req routine checks for a NULL ndlp pointer but does not check the NDLP state. This allows the routine to send an LS IO when the ndlp is disconnected. Check the ndlp for NULL, actual node, Target and MAPPED or Initiator and UNMAPPED. This avoids Fabric nodes getting the Create Association or Create Connection commands. Initiators are free to Reject either Create. Also some of the messages numbers in lpfc_nvme_ls_req were changed because they were already used in other log messages. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
2017-04-24Fix nvme unregister port timeout.James Smart1-3/+5
During some link event testing it was observed that the wait_for_completion_timeout in the lpfc_nvme_unregister_port was timing out all the time. The initiator is claiming the nvme_fc_unregister_remoteport upcall is not completing the unregister in the time allotted. [ 2186.151317] lpfc 0000:07:00.0: 0:(0):6169 Unreg nvme wait failed 0 The wait_for_completion_timeout returns 0 when the wait has been outstanding for the jiffies passed by the caller. In this error message, the nvme initiator passed value 5 - meaning 5 jiffies - and this is just wrong. Calculate 5 seconds in Jiffies and pass that value from the current jiffies. Also the log message for the unregister timeout was reduced because timeout failure is the same as timeout. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
2017-03-15scsi: lpfc: Finalize Kconfig options for nvmeJames Smart1-4/+4
Reviewing the result of what was just added for Kconfig, we made a poor choice. It worked well for full kernel builds, but not so much for how it would be deployed on a distro. Here's the final result: - lpfc will compile in NVME initiator and/or NVME target support based on whether the kernel has the corresponding subsystem support. Kconfig is not used to drive this specifically for lpfc. - There is a module parameter, lpfc_enable_fc4_type, that indicates whether the ports will do FCP-only or FCP & NVME (NVME-only not yet possible due to dependency on fc transport). As FCP & NVME divvys up exchange resources, and given NVME will not be often initially, the default is changed to FCP only. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-03-06scsi: lpfc: Rework lpfc Kconfig for NVME optionsJames Smart1-3/+15
Reworked Kconfig so that lfpc only requires the scsi stack. NVME Initiator and NVME Target support can be enabled if the other NVMe subsystems have been enabled. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-03-06scsi: lpfc: add NVME exchange abortsJames Smart1-5/+59
previous code did little more than log a message. This patch adds abort path support, modeled after the SCSI code paths. Currently addresses only the initiator path. Target path under development, but stubbed out. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-03-06scsi: lpfc: Fix nvme allocation bug on failed nvme_fc_register_localportJames Smart1-2/+2
nvme bufs get allocated even when the registration fails. Move allocation into the rsgistration success path. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-03-06scsi: lpfc: Fix IO submission if WQ is fullJames Smart1-1/+1
For both initiator and target: if WQ is full, return -EBUSY. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>