summaryrefslogtreecommitdiffstats
path: root/drivers/scsi/lpfc
AgeCommit message (Collapse)AuthorFilesLines
2019-12-14Merge tag 'scsi-fixes' of ↵Linus Torvalds1-6/+9
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "24 fixes, all in drivers. The lion's share (16) are qla2xxx and the rest are iscsi (3), ufs (2), smarpqi, lpfc and libsas" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (24 commits) scsi: iscsi: Avoid potential deadlock in iscsi_if_rx func scsi: iscsi: Fix a potential deadlock in the timeout handler scsi: smartpqi: Update attribute name to `driver_version` scsi: libsas: stop discovering if oob mode is disconnected scsi: ufs: Disable autohibern8 feature in Cadence UFS scsi: iscsi: qla4xxx: fix double free in probe scsi: ufs: Give an unique ID to each ufs-bsg scsi: qla2xxx: Add debug dump of LOGO payload and ELS IOCB scsi: qla2xxx: Ignore PORT UPDATE after N2N PLOGI scsi: qla2xxx: Don't defer relogin unconditonally scsi: qla2xxx: Send Notify ACK after N2N PLOGI scsi: qla2xxx: Configure local loop for N2N target scsi: qla2xxx: Fix PLOGI payload and ELS IOCB dump length scsi: qla2xxx: Don't call qlt_async_event twice scsi: qla2xxx: Allow PLOGI in target mode scsi: qla2xxx: Change discovery state before PLOGI scsi: qla2xxx: Drop superfluous INIT_WORK of del_work scsi: qla2xxx: Initialize free_work before flushing it scsi: qla2xxx: Use explicit LOGO in target mode scsi: qla2xxx: Ignore NULL pointer in tcm_qla2xxx_free_mcmd ...
2019-12-13Merge tag 'for-linus-20191212' of git://git.kernel.dk/linux-blockLinus Torvalds1-0/+2
Pull block fixes from Jens Axboe: - stable fix for the bi_size overflow. Not a corruption issue, but a case wher we could merge but disallowed (Andreas) - NVMe pull request via Keith, with various fixes. - MD pull request from Song. - Merge window regression fix for the rq passthrough stats (Logan) - Remove unused blkcg_drain_queue() function (Guoqing) * tag 'for-linus-20191212' of git://git.kernel.dk/linux-block: blk-cgroup: remove blkcg_drain_queue block: fix NULL pointer dereference in account statistics with IDE md: make sure desc_nr less than MD_SB_DISKS md: raid1: check rdev before reference in raid1_sync_request func raid5: need to set STRIPE_HANDLE for batch head block: fix "check bi_size overflow before merge" nvme/pci: Fix read queue count nvme/pci Limit write queue sizes to possible cpus nvme/pci: Fix write and poll queue types nvme/pci: Remove last_cq_head nvme: Namepace identification descriptor list is optional nvme-fc: fix double-free scenarios on hw queues nvme: else following return is not needed nvme: add error message on mismatching controller ids nvme_fc: add module to ops template to allow module references nvmet-loop: Avoid preallocating big SGL for data nvme-fc: Avoid preallocating big SGL for data nvme-rdma: Avoid preallocating big SGL for data
2019-12-09scsi: lpfc: Fix memory leak on lpfc_bsg_write_ebuf_set funcBo Wu1-6/+9
When phba->mbox_ext_buf_ctx.seqNum != phba->mbox_ext_buf_ctx.numBuf, dd_data should be freed before return SLI_CONFIG_HANDLED. When lpfc_sli_issue_mbox func return fails, pmboxq should be also freed in job_error tag. Link: https://lore.kernel.org/r/EDBAAA0BBBA2AC4E9C8B6B81DEEE1D6915E7A966@DGGEML525-MBS.china.huawei.com Signed-off-by: Bo Wu <wubo40@huawei.com> Reviewed-by: Zhiqiang Liu <liuzhiqiang26@huawei.com> Reviewed-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-12-08Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds1-1/+1
Pull more SCSI updates from James Bottomley: "Eleven patches, all in drivers (no core changes) that are either minor cleanups or small fixes. They were late arriving, but still safe for -rc1" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: MAINTAINERS: Add the linux-scsi mailing list to the ISCSI entry scsi: megaraid_sas: Make poll_aen_lock static scsi: sd_zbc: Improve report zones error printout scsi: qla2xxx: Fix qla2x00_request_irqs() for MSI scsi: qla2xxx: unregister ports after GPN_FT failure scsi: qla2xxx: fix rports not being mark as lost in sync fabric scan scsi: pm80xx: Remove unused include of linux/version.h scsi: pm80xx: fix logic to break out of loop when register value is 2 or 3 scsi: scsi_transport_sas: Fix memory leak when removing devices scsi: lpfc: size cpu map by last cpu id set scsi: ibmvscsi_tgt: Remove unneeded variable rc
2019-12-02Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds22-594/+1998
Pull SCSI updates from James Bottomley: "This is mostly update of the usual drivers: aacraid, ufs, zfcp, NCR5380, lpfc, qla2xxx, smartpqi, hisi_sas, target, mpt3sas, pm80xx plus a whole load of minor updates and fixes. The major core changes are Al Viro's reworking of sg's handling of copy to/from user, Ming Lei's removal of the host busy counter to avoid contention in the multiqueue case and Damien Le Moal's fixing of residual tracking across error handling" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (251 commits) scsi: bnx2fc: timeout calculation invalid for bnx2fc_eh_abort() scsi: target: core: Fix a pr_debug() argument scsi: iscsi: Don't send data to unbound connection scsi: target: iscsi: Wait for all commands to finish before freeing a session scsi: target: core: Release SPC-2 reservations when closing a session scsi: target: core: Document target_cmd_size_check() scsi: bnx2i: fix potential use after free Revert "scsi: qla2xxx: Fix memory leak when sending I/O fails" scsi: NCR5380: Add disconnect_mask module parameter scsi: NCR5380: Unconditionally clear ICR after do_abort() scsi: NCR5380: Call scsi_set_resid() on command completion scsi: scsi_debug: num_tgts must be >= 0 scsi: lpfc: use hdwq assigned cpu for allocation scsi: arcmsr: fix indentation issues scsi: qla4xxx: fix double free bug scsi: pm80xx: Modified the logic to collect fatal dump scsi: pm80xx: Tie the interrupt name to the module instance scsi: pm80xx: Controller fatal error through sysfs scsi: pm80xx: Do not request 12G sas speeds scsi: pm80xx: Cleanup command when a reset times out ...
2019-11-27nvme_fc: add module to ops template to allow module referencesJames Smart1-0/+2
In nvme-fc: it's possible to have connected active controllers and as no references are taken on the LLDD, the LLDD can be unloaded. The controller would enter a reconnect state and as long as the LLDD resumed within the reconnect timeout, the controller would resume. But if a namespace on the controller is the root device, allowing the driver to unload can be problematic. To reload the driver, it may require new io to the boot device, and as it's no longer connected we get into a catch-22 that eventually fails, and the system locks up. Fix this issue by taking a module reference for every connected controller (which is what the core layer did to the transport module). Reference is cleared when the controller is removed. Acked-by: Himanshu Madhani <hmadhani@marvell.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2019-11-21scsi: lpfc: size cpu map by last cpu id setJames Smart1-1/+1
Currently the lpfc driver sizes its cpu_map array based on num_possible_cpus(). However, that can be a value that is less than the highest cpu id bit that is set. As such, if a thread runs on a cpu with a larger cpu id, or for_each_possible_cpu() is used, the driver could index off the end of the array and return garbage or GPF. The driver maintains its own internal copy of the "num_possible" cpu value and sizes arrays by it. Fix by setting the driver's value to the value of the last cpu id bit set in the possible_mask - plus 1. Thus cpu_map will be sized to allow access by any cpu id possible. Link: https://lore.kernel.org/r/20191121175556.18953-1-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-19scsi: lpfc: use hdwq assigned cpu for allocationJames Smart1-2/+2
Looking at the recent conversion from smp_processor_id() to raw_smp_processor_id(), realized that the allocation should be based on the cpu the hdwq is bound to, not the executing cpu. Revise to pull cpu number from the hdwq Fixes: 765ab6cdac3b ("scsi: lpfc: Fix a kernel warning triggered by lpfc_get_sgl_per_hdwq()") Link: https://lore.kernel.org/r/20191116003847.6141-1-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-12scsi: lpfc: Update lpfc version to 12.6.0.2James Smart1-1/+1
Update lpfc version to 12.6.0.2 Link: https://lore.kernel.org/r/20191111230401.12958-7-jsmart2021@gmail.com Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-12scsi: lpfc: revise nvme max queues to be hdwq countJames Smart1-6/+4
Driver is setting the initiator nvme template with a max hw queues value of the present cpu count which is odd. It should be registering the number of hdwq queues (queues created on the adapter). Change to set nvme tempate, in all cases, to the number of hardware queues. Link: https://lore.kernel.org/r/20191111230401.12958-6-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-12scsi: lpfc: Initialize cpu_map for not present cpusJames Smart1-1/+18
Currently, cpu_map[cpu#]->hdwq is left to equal LPFC_VECTOR_MAP_EMPTY for not present CPUs. If a CPU is dynamically hot-added, it is possible we may crash due to not assigning an allocated hdwq. Correct by assigning a hdwq at initialization for all not-present CPUs. Fixes: dcaa21367938 ("scsi: lpfc: Change default IRQ model on AMD architectures") Link: https://lore.kernel.org/r/20191111230401.12958-5-jsmart2021@gmail.com Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-12scsi: lpfc: fix inlining of lpfc_sli4_cleanup_poll_list()James Smart2-2/+2
Compilation can fail due to having an inline function reference where the function body is not present. Fix by removing the inline tag. Fixes: 93a4d6f40198 ("scsi: lpfc: Add registration for CPU Offline/Online events") Link: https://lore.kernel.org/r/20191111230401.12958-4-jsmart2021@gmail.com Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-12scsi: lpfc: fix: Coverity: lpfc_cmpl_els_rsp(): Null pointer dereferencesJames Smart1-1/+1
Coverity reported the following: *** CID 101747: Null pointer dereferences (FORWARD_NULL) /drivers/scsi/lpfc/lpfc_els.c: 4439 in lpfc_cmpl_els_rsp() 4433 kfree(mp); 4434 } 4435 mempool_free(mbox, phba->mbox_mem_pool); 4436 } 4437 out: 4438 if (ndlp && NLP_CHK_NODE_ACT(ndlp)) { vvv CID 101747: Null pointer dereferences (FORWARD_NULL) vvv Dereferencing null pointer "shost". 4439 spin_lock_irq(shost->host_lock); 4440 ndlp->nlp_flag &= ~(NLP_ACC_REGLOGIN | NLP_RM_DFLT_RPI); 4441 spin_unlock_irq(shost->host_lock); 4442 4443 /* If the node is not being used by another discovery thread, 4444 * and we are sending a reject, we are done with it. Fix by adding a check for non-null shost in line 4438. The scenario when shost is set to null is when ndlp is null. As such, the ndlp check present was sufficient. But better safe than sorry so add the shost check. Reported-by: coverity-bot <keescook+coverity-bot@chromium.org> Addresses-Coverity-ID: 101747 ("Null pointer dereferences") Fixes: 2e0fef85e098 ("[SCSI] lpfc: NPIV: split ports") CC: James Bottomley <James.Bottomley@SteelEye.com> CC: "Gustavo A. R. Silva" <gustavo@embeddedor.com> CC: linux-next@vger.kernel.org Link: https://lore.kernel.org/r/20191111230401.12958-3-jsmart2021@gmail.com Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-12scsi: lpfc: fix: Coverity: lpfc_get_scsi_buf_s3(): Null pointer dereferencesJames Smart1-1/+1
Coverity reported the following: *** CID 1487391: Null pointer dereferences (FORWARD_NULL) /drivers/scsi/lpfc/lpfc_scsi.c: 614 in lpfc_get_scsi_buf_s3() 608 spin_unlock(&phba->scsi_buf_list_put_lock); 609 } 610 spin_unlock_irqrestore(&phba->scsi_buf_list_get_lock, iflag); 611 612 if (lpfc_ndlp_check_qdepth(phba, ndlp)) { 613 atomic_inc(&ndlp->cmd_pending); vvv CID 1487391: Null pointer dereferences (FORWARD_NULL) vvv Dereferencing null pointer "lpfc_cmd". 614 lpfc_cmd->flags |= LPFC_SBUF_BUMP_QDEPTH; 615 } 616 return lpfc_cmd; 617 } 618 /** 619 * lpfc_get_scsi_buf_s4 - Get a scsi buffer from io_buf_list of the HBA Fix by checking lpfc_cmd to be non-NULL as part of line 612 Reported-by: coverity-bot <keescook+coverity-bot@chromium.org> Addresses-Coverity-ID: 1487391 ("Null pointer dereferences") Fixes: 2a5b7d626ed2 ("scsi: lpfc: Limit tracking of tgt queue depth in fast path") CC: "Martin K. Petersen" <martin.petersen@oracle.com> CC: "Gustavo A. R. Silva" <gustavo@embeddedor.com> CC: linux-next@vger.kernel.org Link: https://lore.kernel.org/r/20191111230401.12958-2-jsmart2021@gmail.com Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-08scsi: lpfc: Fix lpfc_cpumask_of_node_init()Bart Van Assche1-9/+3
Fix the following kernel warning: cpumask_of_node(-1): (unsigned)node >= nr_node_ids(1) Fixes: dcaa21367938 ("scsi: lpfc: Change default IRQ model on AMD architectures") Link: https://lore.kernel.org/r/20191108225947.1395-1-jsmart2021@gmail.com Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-08scsi: lpfc: Fix a kernel warning triggered by lpfc_sli4_enable_intr()Bart Van Assche1-2/+0
Fix the following lockdep warning: ============================================ WARNING: possible recursive locking detected 5.4.0-rc6-dbg+ #2 Not tainted -------------------------------------------- systemd-udevd/130 is trying to acquire lock: ffffffff826b05d0 (cpu_hotplug_lock.rw_sem){++++}, at: irq_calc_affinity_vectors+0x63/0x90 but task is already holding lock: ffffffff826b05d0 (cpu_hotplug_lock.rw_sem){++++}, at: lpfc_sli4_enable_intr+0x422/0xd50 [lpfc] other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(cpu_hotplug_lock.rw_sem); lock(cpu_hotplug_lock.rw_sem); *** DEADLOCK *** May be due to missing lock nesting notation 2 locks held by systemd-udevd/130: #0: ffff8880d53fe210 (&dev->mutex){....}, at: __device_driver_lock+0x4a/0x70 #1: ffffffff826b05d0 (cpu_hotplug_lock.rw_sem){++++}, at: lpfc_sli4_enable_intr+0x422/0xd50 [lpfc] stack backtrace: CPU: 1 PID: 130 Comm: systemd-udevd Not tainted 5.4.0-rc6-dbg+ #2 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 Call Trace: dump_stack+0xa5/0xe6 __lock_acquire.cold+0xf7/0x23a lock_acquire+0x106/0x240 cpus_read_lock+0x41/0xe0 irq_calc_affinity_vectors+0x63/0x90 __pci_enable_msix_range+0x10a/0x950 pci_alloc_irq_vectors_affinity+0x144/0x210 lpfc_sli4_enable_intr+0x4b2/0xd50 [lpfc] lpfc_pci_probe_one+0x1411/0x22b0 [lpfc] local_pci_probe+0x7c/0xc0 pci_device_probe+0x25d/0x390 really_probe+0x170/0x510 driver_probe_device+0x127/0x190 device_driver_attach+0x98/0xa0 __driver_attach+0xb6/0x1a0 bus_for_each_dev+0x100/0x150 driver_attach+0x31/0x40 bus_add_driver+0x246/0x300 driver_register+0xe0/0x170 __pci_register_driver+0xde/0xf0 lpfc_init+0x134/0x1000 [lpfc] do_one_initcall+0xda/0x47e do_init_module+0x10a/0x3b0 load_module+0x4318/0x47c0 __do_sys_finit_module+0x134/0x1d0 __x64_sys_finit_module+0x47/0x50 do_syscall_64+0x6f/0x2e0 entry_SYSCALL_64_after_hwframe+0x49/0xbe Fixes: dcaa21367938 ("scsi: lpfc: Change default IRQ model on AMD architectures") Link: https://lore.kernel.org/r/20191107052158.25788-4-bvanassche@acm.org Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-08scsi: lpfc: Fix a kernel warning triggered by lpfc_get_sgl_per_hdwq()Bart Van Assche1-2/+2
Fix the following kernel bug report: BUG: using smp_processor_id() in preemptible [00000000] code: systemd-udevd/954 Fixes: d79c9e9d4b3d ("scsi: lpfc: Support dynamic unbounded SGL lists on G7 hardware.") Link: https://lore.kernel.org/r/20191107052158.25788-2-bvanassche@acm.org Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06scsi: lpfc: Update lpfc version to 12.6.0.1James Smart1-1/+1
Update lpfc version to 12.6.0.1 Link: https://lore.kernel.org/r/20191105005708.7399-12-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06scsi: lpfc: Add enablement of multiple adapter dumpsJames Smart2-1/+36
Some adapters support the ability to hold multiple adapter dumps on the adapter flash. Some adapters default to enabling this feature while others default to single-dump. Make support uniform by enabling dual dump by default. Link: https://lore.kernel.org/r/20191105005708.7399-11-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06scsi: lpfc: Change default IRQ model on AMD architecturesJames Smart4-138/+484
The current driver attempts to allocate an interrupt vector per cpu using the systems managed IRQ allocator (flag PCI_IRQ_AFFINITY). The system IRQ allocator will either provide the per-cpu vector, or return fewer vectors. When fewer vectors, they are evenly spread between the numa nodes on the system. When run on an AMD architecture, if interrupts occur to a cpu that is not in the same numa node as the adapter generating the interrupt, there are extreme costs and overheads in performance. Thus, if 1:1 vector allocation is used, or the "balanced" vectors in the other numa nodes, performance can be hit significantly. A much more performant model is to allocate interrupts only on the cpus that are in the numa node where the adapter resides. I/O completion is still performed by the cpu where the I/O was generated. Unfortunately, there is no flag to request the managed IRQ subsystem allocate vectors only for the CPUs in the numa node as the adapter. On AMD architecture, revert the irq allocation to the normal style (non-managed) and then use irq_set_affinity_hint() to set the cpu affinity and disable user-space rebalancing. Tie the support into CPU offline/online. If the cpu being offlined owns a vector, the vector is re-affinitized to one of the other CPUs on the same numa node. If there are no more CPUs on the numa node, the vector has all affinity removed and lets the system determine where it's serviced. Similarly, when the cpu that owned a vector comes online, the vector is reaffinitized to the cpu. Link: https://lore.kernel.org/r/20191105005708.7399-10-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06scsi: lpfc: Add registration for CPU Offline/Online eventsJames Smart5-12/+388
The recent affinitization didn't address cpu offlining/onlining. If an interrupt vector is shared and the low order cpu owning the vector is offlined, as interrupts are managed, the vector is taken offline. This causes the other CPUs sharing the vector will hang as they can't get io completions. Correct by registering callbacks with the system for Offline/Online events. When a cpu is taken offline, its eq, which is tied to an interrupt vector is found. If the cpu is the "owner" of the vector and if the eq/vector is shared by other CPUs, the eq is placed into a polled mode. Additionally, code paths that perform io submission on the "sharing CPUs" will check the eq state and poll for completion after submission of new io to a wq that uses the eq. Similarly, when a cpu comes back online and owns an offlined vector, the eq is taken out of polled mode and rearmed to start driving interrupts for eq. Link: https://lore.kernel.org/r/20191105005708.7399-9-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06scsi: lpfc: Clarify FAWNN error messageJames Smart1-2/+2
Current message on FAWWN events is rather cryptic. Expand the message to clarify its meaning. Link: https://lore.kernel.org/r/20191105005708.7399-8-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06scsi: lpfc: Sync with FC-NVMe-2 SLER change to require Conf with SLERJames Smart1-1/+3
Prior to the last FC-NVME-2 draft, SLER and CONF were independent. SLER now requires CONF to be set. Revise the NVME PRLI checking to look for both inorder to enable SLER. Link: https://lore.kernel.org/r/20191105005708.7399-7-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06scsi: lpfc: Fix dynamic fw log enablement checkJames Smart1-1/+1
The recently posted patch had a typo that incorrectly tested the receiving function. Fix the typo (change == to !=) Fixes: 95bfc6d8ad86 ("scsi: lpfc: Make FW logging dynamically configurable") Link: https://lore.kernel.org/r/20191105005708.7399-6-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06scsi: lpfc: Fix unexpected error messages during RSCN handlingJames Smart1-2/+19
During heavy RCN activity and log_verbose = 0 we see these messages: 2754 PRLI failure DID:521245 Status:x9/xb2c00, data: x0 0231 RSCN timeout Data: x0 x3 0230 Unexpected timeout, hba link state x5 This is due to delayed RSCN activity. Correct by avoiding the timeout thus the messages by restarting the discovery timeout whenever an rscn is received. Filter PRLI responses such that severity depends on whether expected for the configuration or not. For example, PRLI errors on a fabric will be informational (they are expected), but Point-to-Point errors are not necessarily expected so they are raised to an error level. Link: https://lore.kernel.org/r/20191105005708.7399-5-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06scsi: lpfc: Fix kernel crash at lpfc_nvme_info_show during remote port bounceJames Smart1-20/+20
When reading sysfs nvme_info file while a remote port leaves and comes back, a NULL pointer is encountered. The issue is due to ndlp list corruption as the the nvme_info_show does not use the same lock as the rest of the code. Correct by removing the rcu_xxx_lock calls and replace by the host_lock and phba->hbaLock spinlocks that are used by the rest of the driver. Given we're called from sysfs, we are safe to use _irq rather than _irqsave. Link: https://lore.kernel.org/r/20191105005708.7399-4-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06scsi: lpfc: Fix configuration of BB credit recovery in service parametersJames Smart1-10/+3
The driver today is reading service parameters from the firmware and then overwriting the firmware-provided values with values of its own. There are some switch features that require preliminary FLOGI's that are switch-specific and done prior to the actual fabric FLOGI for traffic. The fw will perform those FLOGIs and will revise the service parameters for the features configured. As the driver later overwrites those values with its own values, it misconfigures things like BBSCN use by doing so. Correct by eliminating the driver-overwrite of firmware values. The driver correctly re-reads the service parameters after each link up to obtain the latest values from firmware. Link: https://lore.kernel.org/r/20191105005708.7399-3-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06scsi: lpfc: Fix duplicate unreg_rpi error in port offline flowJames Smart1-0/+7
If the driver receives a login that is later then LOGO'd by the remote port (aka ndlp), the driver, upon the completion of the LOGO ACC transmission, will logout the node and unregister the rpi that is being used for the node. As part of the unreg, the node's rpi value is replaced by the LPFC_RPI_ALLOC_ERROR value. If the port is subsequently offlined, the offline walks the nodes and ensures they are logged out, which possibly entails unreg'ing their rpi values. This path does not validate the node's rpi value, thus doesn't detect that it has been unreg'd already. The replaced rpi value is then used when accessing the rpi bitmask array which tracks active rpi values. As the LPFC_RPI_ALLOC_ERROR value is not a valid index for the bitmask, it may fault the system. Revise the rpi release code to detect when the rpi value is the replaced RPI_ALLOC_ERROR value and ignore further release steps. Link: https://lore.kernel.org/r/20191105005708.7399-2-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-02Merge tag 'scsi-fixes' of ↵Linus Torvalds2-3/+3
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Nine changes, eight in drivers [ufs, target, lpfc x 2, qla2xxx x 4] and one core change in sd that fixes an I/O failure on DIF type 3 devices" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: qla2xxx: stop timer in shutdown path scsi: sd: define variable dif as unsigned int instead of bool scsi: target: cxgbit: Fix cxgbit_fw4_ack() scsi: qla2xxx: Fix partial flash write of MBI scsi: qla2xxx: Initialized mailbox to prevent driver load failure scsi: lpfc: Honor module parameter lpfc_use_adisc scsi: ufs-bsg: Wake the device before sending raw upiu commands scsi: lpfc: Check queue pointer before use scsi: qla2xxx: fixup incorrect usage of host_byte
2019-10-28scsi: lpfc: Make lpfc_debugfs_ras_log_data staticYueHaibing1-2/+2
Fix sparse warning: drivers/scsi/lpfc/lpfc_debugfs.c:2083:1: warning: symbol 'lpfc_debugfs_ras_log_data' was not declared. Should it be static? Link: https://lore.kernel.org/r/20191028132556.16272-1-yuehaibing@huawei.com Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-28scsi: lpfc: Fix NULL check before mempool_destroy is not neededSaurav Girepunje1-2/+1
mempool_destroy has taken null pointer check into account. Remove the redundant check. Link: https://lore.kernel.org/r/20191026194712.GA22249@saurav Signed-off-by: Saurav Girepunje <saurav.girepunje@gmail.com> Reviewed-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-28scsi: lpfc: fix spelling error in MAGIC_NUMER_xxxJames Smart2-4/+4
convert MAGIC_NUMER_xxx to MAGIC_NUMBER_xxx Link: https://lore.kernel.org/r/20191025184342.6623-1-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-28scsi: lpfc: fix build error of lpfc_debugfs.c for vfree/vmallocJames Smart1-0/+1
lpfc_debufs.c was missing include of vmalloc.h when compiled on PPC. Add missing header. Link: https://lore.kernel.org/r/20191025182530.26653-1-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-25Merge tag 'scsi-fixes' of ↵Linus Torvalds2-4/+0
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Nine changes, eight to drivers (qla2xxx, hpsa, lpfc, alua, ch, 53c710[x2], target) and one core change that tries to close a race between sysfs delete and module removal" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: lpfc: remove left-over BUILD_NVME defines scsi: core: try to get module before removing device scsi: hpsa: add missing hunks in reset-patch scsi: target: core: Do not overwrite CDB byte 1 scsi: ch: Make it possible to open a ch device multiple times again scsi: fix kconfig dependency warning related to 53C700_LE_ON_BE scsi: sni_53c710: fix compilation error scsi: scsi_dh_alua: handle RTPG sense code correctly during state transitions scsi: qla2xxx: fix a potential NULL pointer dereference
2019-10-24scsi: lpfc: lpfc_nvmet: Fix Use plain integer as NULL pointerSaurav Girepunje1-2/+2
Replace assignment of 0 to pointer with NULL assignment. Link: https://lore.kernel.org/r/20191024030857.GA12097@saurav Signed-off-by: Saurav Girepunje <saurav.girepunje@gmail.com> Acked-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24scsi: lpfc: lpfc_attr: Fix Use plain integer as NULL pointerSaurav Girepunje1-1/+1
Replace assignment of 0 to pointer with NULL assignment. Link: https://lore.kernel.org/r/20191024025726.GA31421@saurav Signed-off-by: Saurav Girepunje <saurav.girepunje@gmail.com> Acked-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24scsi: lpfc: Update lpfc version to 12.6.0.0James Smart1-1/+1
Update lpfc version to 12.6.0.0 Link: https://lore.kernel.org/r/20191018211832.7917-17-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24scsi: lpfc: Add additional discovery log messagesJames Smart3-9/+62
When debugging a recent discovery customer problem it was very hard to tell what was happening with the existing discovery log messages. To fully debug the issue additional log messages were necessary. Add or extend log messages so that sufficient information is present for debugging. Link: https://lore.kernel.org/r/20191018211832.7917-16-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24scsi: lpfc: Add FC-AL support to lpe32000 modelsJames Smart6-1/+142
In the past, the lpe32000 models, based their main support being for 32G, and as FC-AL is not supported in the FC standards past 8G, did not support FC-AL operation. This patch adds private-loop FC-AL support for the LPE32000 adapters when a link is 8G or below. To avoid conditions where link rate may change, which would cause non-connectivity to the AL device, FC-AL mode must become a persistent setting and the link kept at a speed supporting FC-AL. The patch: - Adds a pls attribute indicating whether the adapter properly supports FC-AL. - Adds support for the adapter to indicate that topology should be fixed and the topology types to be configured. - Adds a pt attribute to report the persistent topology if present. Link: https://lore.kernel.org/r/20191018211832.7917-15-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24scsi: lpfc: Add FA-WWN Async Event reportingJames Smart2-0/+11
Add decode support for adapter Async Events which report FA-WWN configuration errors. Link: https://lore.kernel.org/r/20191018211832.7917-14-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24scsi: lpfc: Add log macros to allow print by serverity or verbosity settingJames Smart1-0/+17
Add two new macros to aid in message logging: Both macros print a message if the corresponding lpfc verbosity setting is set or the kernel log level is WARNING or more critical. One macro is for use with a phba structure, the other with a vport structure. [mkp: typo] Link: https://lore.kernel.org/r/20191018211832.7917-13-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24scsi: lpfc: Make FW logging dynamically configurableJames Smart5-10/+204
Currently, the FW logging facility is a load/boot time parameter which requires the driver to be unloaded/reloaded or the system rebooted in order to change its configuration. Convert the logging facility to allow dynamic enablement and configuration. Specifically: - Convert the feature so that it can be enabled dynamically via an attribute. Additionally, the size of the buffer can be configured dynamically. - Add locks around states that now may be changing. - Tie the feature into debugfs so that the logs can be read at any time. Link: https://lore.kernel.org/r/20191018211832.7917-12-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24scsi: lpfc: Revise interrupt coalescing for missing scenariosJames Smart4-32/+24
The existing "auto eq delay" mechanism was sometimes skipping over an EQ, not ramping the coalescing down under light load fast enough, and in other cases never kicked in as cpu sharing by multiple vectors didn't quite add up right. Tweak the interrupt mechanism such that: - Add a flag to the EQ to force checking for colaescing values when being serviced in the interrupt handler. The flag will be set by any CQ bound to the EQ whenever the number of CQ elements process in a single scan meets or exceeds the hardware queue notify level. E.g. there's a significant number of completions happening. - In the heartbeat work item that checks coalescing: - Replace the structure that was counting the number of EQs that interrupted on a single cpu with a new structure that looks at the EQ to see whether EQ currently has a coalescing value (thus it should be re-evaluate) or was marked by the new flag indicating heavy completions. - When a cpu, which may be servicing multiple vectors, had at least 1 EQ that should be checked, a new coalescing delay is calculated based on the number of interrupts that occurred on the cpu. - The new coalescing value is then applied to the EQs that had interrupted on the cpu. Link: https://lore.kernel.org/r/20191018211832.7917-11-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24scsi: lpfc: Remove lock contention target write pathJames Smart5-54/+14
Lower IOps performance with write operations. Perf tool shows lock contention in dma_pool_alloc and dma_pool_free related to the txrdy_payload_pool. The allocations are for dma buffers for XFER_RDY's, which actually are not needed for the FCP_TRECEIVE command as the command contents are used by the adapter to generate the IU. Remove the allocations and the associated buffer pool. Rather than leaving NULLs in buffer pointer locations, set command and sgl to indicate skipped SGLE indexes. Link: https://lore.kernel.org/r/20191018211832.7917-10-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24scsi: lpfc: Slight fast-path performance optimizationsJames Smart2-9/+10
Slightly rework some error check code paths for better streamlining. Added compiler unlikely hints to allow slightly better optimization of the fast-path. Removed a few pointer checks that were obviously already valid. Link: https://lore.kernel.org/r/20191018211832.7917-9-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24scsi: lpfc: fix coverity error of dereference after null checkJames Smart1-2/+2
Log message conditional upon vport being NULL dereferences vport to determine log verbose setting. Changed to use lpfc_print_log which uses phba to determine the active log verbose setting. Fixes: 43bfea1bffb6 ("scsi: lpfc: Fix coverity errors on NULL pointer checks") Link: https://lore.kernel.org/r/20191018211832.7917-8-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24scsi: lpfc: Fix hardlockup in lpfc_abort_handlerJames Smart1-2/+3
In lpfc_abort_handler, the lock acquire order is hbalock (irqsave), buf_lock (irq) and ring_lock (irq). The issue is that in two places the locks are released out of order - the buf_lock and the hbalock - resulting in the cpu preemption/lock flags getting restored out of order and deadlocking the cpu. Fix the unlock order by fully releasing the hbalocks as well. CC: Zhangguanghui <zhang.guanghui@h3c.com> Link: https://lore.kernel.org/r/20191018211832.7917-7-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24scsi: lpfc: Fix bad ndlp ptr in xri aborted handlingJames Smart3-7/+12
In cases where I/O may be aborted, such as driver unload or link bounces, the system will crash based on a bad ndlp pointer. Example: RIP: 0010:lpfc_sli4_abts_err_handler+0x15/0x140 [lpfc] ... lpfc_sli4_io_xri_aborted+0x20d/0x270 [lpfc] lpfc_sli4_sp_handle_abort_xri_wcqe.isra.54+0x84/0x170 [lpfc] lpfc_sli4_fp_handle_cqe+0xc2/0x480 [lpfc] __lpfc_sli4_process_cq+0xc6/0x230 [lpfc] __lpfc_sli4_hba_process_cq+0x29/0xc0 [lpfc] process_one_work+0x14c/0x390 Crash was caused by a bad ndlp address passed to I/O indicated by the XRI aborted CQE. The address was not NULL so the routine deferenced the ndlp ptr. The bad ndlp also caused the lpfc_sli4_io_xri_aborted to call an erroneous io handler. Root cause for the bad ndlp was an lpfc_ncmd that was aborted, put on the abort_io list, completed, taken off the abort_io list, sent to lpfc_release_nvme_buf where it was put back on the abort_io list because the lpfc_ncmd->flags setting LPFC_SBUF_XBUSY was not cleared on the final completion. Rework the exchange busy handling to ensure the flags are properly set for both scsi and nvme. Fixes: c490850a0947 ("scsi: lpfc: Adapt partitioned XRI lists to efficient sharing") Cc: <stable@vger.kernel.org> # v5.1+ Link: https://lore.kernel.org/r/20191018211832.7917-6-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24scsi: lpfc: Fix SLI3 hba in loop mode not discovering devicesJames Smart1-1/+3
When operating in private loop mode, PLOGI exchanges are racing and the driver tries to abort it's PLOGI. But the PLOGI abort ends up terminating the login with the other end causing the other end to abort its PLOGI as well. Discovery never fully completes. Fix by disabling the PLOGI abort when private loop and letting the state machine play out. Link: https://lore.kernel.org/r/20191018211832.7917-5-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24scsi: lpfc: Fix lockdep errors in sli_ringtx_putJames Smart1-3/+7
Fix lockdep error in __lpfc_sli_ringtx_put(): The hbalock is valid for sli3, but not for sli4. Change lockdep to look at ring lock if sli4. Also update comment in __lpfc_sli_issue_iocb_s4() to reflect proper lock. Note: lockdep check is already correct. Link: https://lore.kernel.org/r/20191018211832.7917-4-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>