summaryrefslogtreecommitdiffstats
path: root/drivers/infiniband
AgeCommit message (Collapse)AuthorFilesLines
2016-10-09Merge tag 'for-linus-2' of ↵Linus Torvalds17-121/+190
git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull more rdma updates from Doug Ledford: "Minor updates for rxe driver" [ Starting to do merge window pulls again - the current -git tree does appear to have some netfilter use-after-free issues, but I've sent off the report to the proper channels, and I don't want to delay merge window activity any more ] * tag 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: IB/rxe: improved debug prints & code cleanup rdma_rxe: Ensure rdma_rxe init occurs at correct time IB/rxe: Properly honor max IRD value for rd/atomic. IB/{rxe,core,rdmavt}: Fix kernel crash for reg MR IB/rxe: Fix sending out loopback packet on netdev interface. IB/rxe: Avoid scheduling tasklet for userspace QP
2016-10-06IB/rxe: improved debug prints & code cleanupParav Pandit14-88/+109
1. Debugging qp state transitions and qp errors in loopback and multiple QP tests is difficult without qp numbers in debug logs. This patch adds qp number to important debug logs. 2. Instead of having rxe: prefix in few logs and not having in few logs, using uniform module name prefix using pr_fmt macro. 3. Code cleanup for various warnings reported by checkpatch for incomplete unsigned data type, line over 80 characters, return statements. Signed-off-by: Parav Pandit <pandit.parav@gmail.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-06rdma_rxe: Ensure rdma_rxe init occurs at correct timeStephen Bates1-1/+1
There is a problem when CONFIG_RDMA_RXE=y and CONFIG_IPV6=y. This results in the rdma_rxe initialization occurring before the IPv6 services are ready. This patch delays the initialization of rdma_rxe until after the IPv6 services are ready. This fix is based on one proposed by Logan Gunthorpe on a much older code base. Signed-off-by: Stephen Bates <sbates@raithlin.com> Reviewed-by: Yonatan Cohen <yonatanc@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-06IB/rxe: Properly honor max IRD value for rd/atomic.Parav Pandit3-13/+15
This patch honoris the max incoming read request count instead of outgoing read req count (a) during modify qp by allocating response queue metadata (b) during incoming read request processing Signed-off-by: Parav Pandit <pandit.parav@gmail.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-06IB/{rxe,core,rdmavt}: Fix kernel crash for reg MRParav Pandit2-0/+34
This patch fixes below kernel crash on memory registration for rxe and other transport drivers which has dma_ops extension. IB/core invokes ib_map_sg_attrs() in generic manner with dma attributes which is used by mlx5 and mthca adapters. However in doing so it ignored honoring dma_ops extension of software based transports for sg map/unmap operation. This results in calling dma_map_sg_attrs of hardware virtual device resulting in crash for null reference. We extend the core to support sg_map/unmap_attrs and transport drivers to implement those dma_ops callback functions. Verified usign perftest applications. BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffff81032a75>] check_addr+0x35/0x60 ... Call Trace: [<ffffffff81032b39>] ? nommu_map_sg+0x99/0xd0 [<ffffffffa02b31c6>] ib_umem_get+0x3d6/0x470 [ib_core] [<ffffffffa01cc329>] rxe_mem_init_user+0x49/0x270 [rdma_rxe] [<ffffffffa01c793a>] ? rxe_add_index+0xca/0x100 [rdma_rxe] [<ffffffffa01c995f>] rxe_reg_user_mr+0x9f/0x130 [rdma_rxe] [<ffffffffa00419fe>] ib_uverbs_reg_mr+0x14e/0x2c0 [ib_uverbs] [<ffffffffa003d3ab>] ib_uverbs_write+0x15b/0x3b0 [ib_uverbs] [<ffffffff811e92a6>] ? mem_cgroup_commit_charge+0x76/0xe0 [<ffffffff811af0a9>] ? page_add_new_anon_rmap+0x89/0xc0 [<ffffffff8117e6c9>] ? lru_cache_add_active_or_unevictable+0x39/0xc0 [<ffffffff811f0da8>] __vfs_write+0x28/0x120 [<ffffffff811f1239>] ? rw_verify_area+0x49/0xb0 [<ffffffff811f1492>] vfs_write+0xb2/0x1b0 [<ffffffff811f27d6>] SyS_write+0x46/0xa0 [<ffffffff814f7d32>] entry_SYSCALL_64_fastpath+0x1a/0xa4 Signed-off-by: Parav Pandit <pandit.parav@gmail.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-06IB/rxe: Fix sending out loopback packet on netdev interface.Parav Pandit1-6/+6
Both prepare4 and prepare6 sets loopback mask in pkt_info structure instance of skb. The xmit_packet and other requester side functions use a pkt_info struct from the stack without the proper mask. This results in sending out the packet to the actual netdev device and loopback functionality is broken. Modify prepare() to pass its correctly marked pkt_info struct to prepare4() and prepare6() instead of them using SKB_TO_PKT(skb) and getting an incorrectly set mask. Verified with perftest applications. Signed-off-by: Parav Pandit <pandit.parav@gmail.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-06IB/rxe: Avoid scheduling tasklet for userspace QPParav Pandit1-13/+25
This patch avoids scheduing tasklet for WQE and protocol processing for user space QP. It performs the task in calling process context. To improve code readability kernel specific post_send handling moved to post_send_kernel() function. Signed-off-by: Parav Pandit <pandit.parav@gmail.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds10-450/+351
Pull networking updates from David Miller: 1) BBR TCP congestion control, from Neal Cardwell, Yuchung Cheng and co. at Google. https://lwn.net/Articles/701165/ 2) Do TCP Small Queues for retransmits, from Eric Dumazet. 3) Support collect_md mode for all IPV4 and IPV6 tunnels, from Alexei Starovoitov. 4) Allow cls_flower to classify packets in ip tunnels, from Amir Vadai. 5) Support DSA tagging in older mv88e6xxx switches, from Andrew Lunn. 6) Support GMAC protocol in iwlwifi mwm, from Ayala Beker. 7) Support ndo_poll_controller in mlx5, from Calvin Owens. 8) Move VRF processing to an output hook and allow l3mdev to be loopback, from David Ahern. 9) Support SOCK_DESTROY for UDP sockets. Also from David Ahern. 10) Congestion control in RXRPC, from David Howells. 11) Support geneve RX offload in ixgbe, from Emil Tantilov. 12) When hitting pressure for new incoming TCP data SKBs, perform a partial rathern than a full purge of the OFO queue (which could be huge). From Eric Dumazet. 13) Convert XFRM state and policy lookups to RCU, from Florian Westphal. 14) Support RX network flow classification to igb, from Gangfeng Huang. 15) Hardware offloading of eBPF in nfp driver, from Jakub Kicinski. 16) New skbmod packet action, from Jamal Hadi Salim. 17) Remove some inefficiencies in snmp proc output, from Jia He. 18) Add FIB notifications to properly propagate route changes to hardware which is doing forwarding offloading. From Jiri Pirko. 19) New dsa driver for qca8xxx chips, from John Crispin. 20) Implement RFC7559 ipv6 router solicitation backoff, from Maciej Żenczykowski. 21) Add L3 mode to ipvlan, from Mahesh Bandewar. 22) Support 802.1ad in mlx4, from Moshe Shemesh. 23) Support hardware LRO in mediatek driver, from Nelson Chang. 24) Add TC offloading to mlx5, from Or Gerlitz. 25) Convert various drivers to ethtool ksettings interfaces, from Philippe Reynes. 26) TX max rate limiting for cxgb4, from Rahul Lakkireddy. 27) NAPI support for ath10k, from Rajkumar Manoharan. 28) Support XDP in mlx5, from Rana Shahout and Saeed Mahameed. 29) UDP replicast support in TIPC, from Richard Alpe. 30) Per-queue statistics for qed driver, from Sudarsana Reddy Kalluru. 31) Support BQL in thunderx driver, from Sunil Goutham. 32) TSO support in alx driver, from Tobias Regnery. 33) Add stream parser engine and use it in kcm. 34) Support async DHCP replies in ipconfig module, from Uwe Kleine-König. 35) DSA port fast aging for mv88e6xxx driver, from Vivien Didelot. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1715 commits) mlxsw: switchx2: Fix misuse of hard_header_len mlxsw: spectrum: Fix misuse of hard_header_len net/faraday: Stop NCSI device on shutdown net/ncsi: Introduce ncsi_stop_dev() net/ncsi: Rework the channel monitoring net/ncsi: Allow to extend NCSI request properties net/ncsi: Rework request index allocation net/ncsi: Don't probe on the reserved channel ID (0x1f) net/ncsi: Introduce NCSI_RESERVED_CHANNEL net/ncsi: Avoid unused-value build warning from ia64-linux-gcc net: Add netdev all_adj_list refcnt propagation to fix panic net: phy: Add Edge-rate driver for Microsemi PHYs. vmxnet3: Wake queue from reset work i40e: avoid NULL pointer dereference and recursive errors on early PCI error qed: Add RoCE ll2 & GSI support qed: Add support for memory registeration verbs qed: Add support for QP verbs qed: PD,PKEY and CQ verb support qed: Add support for RoCE hw init qede: Add qedr framework ...
2016-10-04Merge tag 'for-linus' of ↵Linus Torvalds45-899/+1433
git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull hdi1 rdma driver updates from Doug Ledford: "This is the first pull request of the 4.9 merge window for the RDMA subsystem. It is only the hfi1 driver. It had dependencies on code that only landed late in the 4.7-rc cycle (around 4.7-rc7), so putting this with my other for-next code would have create an ugly merge of lot of 4.7-rc stuff. For that reason, it's being submitted individually. It's been through 0day and linux-next" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (37 commits) IB/rdmavt: Trivial function comment corrected. IB/hfi1: Fix trace of atomic ack IB/hfi1: Update SMA ingress checks for response packets IB/hfi1: Use EPROM platform configuration read IB/hfi1: Add ability to read platform config from the EPROM IB/hfi1: Restore EPROM read ability IB/hfi1: Document new sysfs entries for hfi1 driver IB/hfi1: Add new debugfs sdma_cpu_list file IB/hfi1: Add irq affinity notification handler IB/hfi1: Add a new VL sysfs attribute for sdma engines IB/hfi1: Add sysfs interface for affinity setup IB/hfi1: Fix resource release in context allocation IB/hfi1: Remove unused variable from devdata IB/hfi1: Cleanup tasklet refs in comments IB/hfi1: Adjust hardware buffering parameter IB/hfi1: Act on external device timeout IB/hfi1: Fix defered ack race with qp destroy IB/hfi1: Combine shift copy and byte copy for SGE reads IB/hfi1: Do not read more than a SGE length IB/hfi1: Extend i2c timeout ...
2016-10-03IB/rdmavt: Trivial function comment corrected.Parav Pandit1-1/+1
Corrected function name in comment from qib_ to rvt_. Signed-off-by: Parav Pandit <pandit.parav@gmail.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-02IB/hfi1: Fix trace of atomic ackMike Marciniszyn2-3/+3
The length is incorrect, causing the trace data to be truncated. Add the additional 8 bytes that should have been there. Also trace out the atomic ack in hex to aid debugging. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-02IB/hfi1: Update SMA ingress checks for response packetsJianxin Xiong1-25/+24
Fix "unsupported method" error by skipping ingress pkey checks on response SMA packets. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-02IB/hfi1: Use EPROM platform configuration readDean Luick2-15/+26
The driver will now try to read directly from the EPROM as its first choice for the platform configuration file. Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-02IB/hfi1: Add ability to read platform config from the EPROMDean Luick2-0/+84
Add a function to read the platform configuration file from the EPROM. Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-02IB/hfi1: Restore EPROM read abilityDean Luick3-3/+103
Partially revert commit d07903174202 ("IB/hfi1: Remove EPROM functionality from data device"), bringing back the ability to read from the EPROM. This code will be used for driver-only acccess to the EPROM, hence change EPROM read to save to a buffer instead of copy touser. Also allow any offset and remove missed includes and leftover declarations. Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-02IB/hfi1: Add new debugfs sdma_cpu_list fileTadeusz Struk3-0/+83
Add a debugfs sdma_cpu_list file that can be used to examine the CPU to sdma engine assignments for the whole device. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Reviewed-by: Jianxin Xiong <jianxin.xiong@intel.com> Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-02IB/hfi1: Add irq affinity notification handlerTadeusz Struk2-10/+103
This patch adds an irq affinity notification handler. When a user changes interrupt affinity settings for an sdma engine, the driver needs to make changes to its internal sde structures and also update the affinity_hint. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Reviewed-by: Jianxin Xiong <jianxin.xiong@intel.com> Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-02IB/hfi1: Add a new VL sysfs attribute for sdma enginesTadeusz Struk1-1/+14
This patch adds a read-only "VL" attribute for the sysfs entry of each sdma engine. It will allow the user to check VL to sdma engine mappings. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Reviewed-by: Jianxin Xiong <jianxin.xiong@intel.com> Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-02IB/hfi1: Add sysfs interface for affinity setupTadeusz Struk5-7/+412
Some users want more control over which cpu cores are being used by the driver. For example, users might want to restrict the driver to some specified subset of the cores so that they can appropriately partition processes, irq handlers, and work threads. To allow the user to fine tune system affinity settings new sysfs attributes are introduced per sdma engine. This patch adds a new attribute type for sdma engine and a new cpu_list attribute. When the user writes a cpu range to the cpu_list attribute the driver will create an internal cpu->sdma map, which will be used later as a look-up table to choose an optimal engine for a user requests. Reviewed-by: Dean Luick <dean.luick@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Reviewed-by: Jianxin Xiong <jianxin.xiong@intel.com> Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-02IB/hfi1: Fix resource release in context allocationJakub Pawlak2-5/+13
Correct resource free in allocate_ctxt() function. When context creation fails allocated resources are properly released and pointer in receive context data table is set back to NULL. Reviewed-by: Dean Luick <dean.luick@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jakub Pawlak <jakub.pawlak@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-02IB/hfi1: Remove unused variable from devdataDennis Dalessandro1-2/+0
We no longer use an error tasklet. Remove it from the hfi1_devdata structure. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-02IB/hfi1: Cleanup tasklet refs in commentsDennis Dalessandro2-10/+10
The code no longer uses tasklets for the send engine. However it does use a tasklet for sdma but the send routines use a workqueue now days. Update the comments to reflect that. Make things more generic with saying "send engine" because that is what is being referred to. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-02IB/hfi1: Adjust hardware buffering parameterHarish Chegondi2-3/+3
It was determined that 0x880 is a better value for hardware buffering, use it. Reviewed-by: Dean Luick <dean.luick@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Harish Chegondi <harish.chegondi@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-02IB/hfi1: Act on external device timeoutDean Luick2-2/+6
Add missing external device timeout notification. Recognize it as a failed LNI signal from the 8051 firmware. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-02IB/hfi1: Fix defered ack race with qp destroyMike Marciniszyn1-1/+4
There is a a bug in defered ack stuff that causes a race with the destroy of a QP. A packet causes a defered ack to be pended by putting the QP into an rcd queue. A return from the driver interrupt processing will process that rcd queue of QPs and attempt to do a direct send of the ack. At this point no locks are held and the above QP could now be put in the reset state in the qp destroy logic. A refcount protects the QP while it is in the rcd queue so it isn't going anywhere yet. If the direct send fails to allocate a pio buffer, hfi1_schedule_send() is called to trigger sending an ack from the send engine. There is no state test in that code path. The refcount is then dropped from the driver.c caller potentially allowing the qp destroy to continue from its refcount wait in parallel with the workqueue scheduling of the qp. Cc: stable@vger.kernel.org Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-02IB/hfi1: Combine shift copy and byte copy for SGE readsSebastian Sanchez1-137/+23
Prevent over-reading the SGE length by using byte reads for non quad-word reads. Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-02IB/hfi1: Do not read more than a SGE lengthSebastian Sanchez1-48/+40
In certain cases, if the tail of an SGE is not 8-byte aligned, bytes beyond the end to an 8-byte alignment can be read. Change the copy routine to avoid the over-read. Instead, stop on the final whole quad-word, then read the remaining bytes. Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-02IB/hfi1: Extend i2c timeoutDean Luick1-1/+1
Allow a longer timeout for i2c due to clock stretching and inaccurate jiffy timing when under a spin lock. This timeout is consistent with other i2c-algo-bit users. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-02IB/hfi1: Increase default settings of max_cqes and max_qpsJianxin Xiong1-2/+2
The ib_write_bw test allows using up to 16384 QPs. When a relatively large number of QPs (within that range) is used, the test can fail because the number of CQ entries needed exceeds the limit set by the driver. This patch increases the default setting of max_cqes from 0x2FFFF (196607) to 0x2FFFFF(3145727), which is sufficient to cover the maximum number needed by the ib_write_bw test (2097152). The default setting of max_qps is also increased from 16384 to 32768 to allow the test to run successfully with 16383 or 16384 QPs. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-02IB/hfi1: Remove filtering of Set(PkeyTable) in HFI SMASebastian Sanchez1-6/+0
The FM should have full control to set the pkeys in the driver pkey table. Remove filtering done by the driver. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-02IB/qib: Remove qpt_mask globalDennis Dalessandro3-13/+3
There is no need to have a global qpt_mask as that does not support the multiple chip model which qib has. Instead rely on the value which exists already in the device data (dd). Fixes: 898fa52b4ac3 "IB/qib: Remove qpn, qp tables and related variables from qib" Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-02IB/hfi1: Consolidate pio control masks into single definitionMike Marciniszyn4-30/+32
This allows for adding additional pages of adaptive pio opcode control including manufacturer specific ones. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-02IB/rdmavt, IB/hfi1: Add lockdep asserts for lock debugMike Marciniszyn3-2/+30
This patch adds lockdep asserts in key code paths for insuring lock correctness. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-02IB/rdmavt: Add qp init functionMike Marciniszyn1-42/+58
Add an rvt_qp_init() to initialize specific common fields as the qp is created or reset. The routine is shared by the rvt_reset_qp() and the rvt_create_qp(). The intent is that lock dep assertions will only appear in the rvt_reset_qp(). Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-02IB/rdmavt: Move reset calldown to reset pathMike Marciniszyn1-6/+5
The reset calldown is misplaced. It should only be called in the code that actually transitions the QP to reset. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-02IB/hfi1: Move iowait_init() to priv allocateMike Marciniszyn1-7/+7
The call is misplaced in the reset calldown function and causes issues with lockdep assertions that are to be added. Fixes: Commit a2c2d608957c ("staging/rdma/hfi1: Remove create_qp functionality") Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-02IB/rdmavt: Correct sparse annotationMike Marciniszyn1-6/+3
The __must_hold() is sufficent to correct the sparse context imbalance inside a function. Per Documentation/sparse.txt: __must_hold - The specified lock is held on function entry and exit. Fixes: Commit c0a67f6ba356 ("IB/rdmavt: Annotate rvt_reset_qp()") Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-02IB/hfi1: Fix locking scheme for affinity settingsTadeusz Struk2-43/+51
Existing locking scheme in affinity.c file using the &node_affinity.lock spinlock is not very elegant. We acquire the lock to get hfi1_affinity_node entry, unlock, and then use the entry without the lock held. With more functions being added, which access and modify the entries, this can lead to race conditions. This patch makes this locking scheme more consistent. It changes the spinlock to mutex. Since all the code is executed in a user process context there is no need for a spinlock. This also allows to keep the lock not only while we look up for the node affinity entry, but over the whole section where the entry is being used. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Reviewed-by: Jianxin Xiong <jianxin.xiong@intel.com> Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-02IB/hfi1: Fix user-space buffers mapping with IOMMU enabledTymoteusz Kielan7-63/+79
The dma_XXX API functions return bus addresses which are physical addresses when IOMMU is disabled. Buffer mapping to user-space is done via remap_pfn_range() with PFN based on bus address instead of physical. This results in wrong pages being mapped to user-space when IOMMU is enabled. Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Tymoteusz Kielan <tymoteusz.kielan@intel.com> Signed-off-by: Andrzej Kacprowski <andrzej.kacprowski@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-02IB/hfi1: Fix the count of user packets submitted to an SDMA engineHarish Chegondi3-28/+30
Each user SDMA request coming into the driver may contain multiple packets. Each user packet may use multiple SDMA descriptors to fill the send buffer. The field seqsubmitted in struct user_sdma_request counts the number of user packets submitted to an SDMA engine. Sometimes, the intermediate count may not be updated properly. However, once all the packets' descriptors are successfully submitted to the SDMA engine, the final count is updated correctly. But, if only some of the packets are submitted to the engine due to an error, the intermediate count doesn't reflect the partial number of packets submitted to the SDMA engine. This can cause a hang later in the code as the count of packets submitted to the SDMA engine doesn't match the the count of packets processed by the SDMA engine. Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Harish Chegondi <harish.chegondi@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-02IB/hfi1: Move serdes tune inside link start functionDean Luick2-15/+8
All calls to tune_serdes and start_link are paired. Move tune_serdes inside start_link. Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-02IB/qib,IB/hfi: Use core common header fileMike Marciniszyn23-346/+162
Use common header file structs, defines, and accessors in the drivers. The old declarations are removed. The repositioning of the includes allows for the removal of hfi1_message_header and replaces its use with ib_header. Also corrected are two issues with set_armed_to_active(): - The "packet" parameter is now a pointer as it should have been - The etype is validated to insure that the header is correct Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Don Hiatt <don.hiatt@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-09-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller18-81/+200
2016-09-19chcr/cxgb4i/cxgbit/RDMA/cxgb4: Allocate resources dynamically for all cxgb4 ↵Hariprasad Shenai1-0/+4
ULD's Allocate resources dynamically to cxgb4's Upper layer driver's(ULD) like cxgbit, iw_cxgb4 and cxgb4i. Allocate resources when they register with cxgb4 driver and free them while unregistering. All the queues and the interrupts for them will be allocated during ULD probe only and freed during remove. Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-16Merge tag 'for-linus' of ↵Linus Torvalds15-80/+189
git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull rdma fixes from Doug Ledford: "Round three of 4.8 rc fixes. This is likely the last rdma pull request this cycle. The new rxe driver had a few issues (you probably saw the boot bot bug report) and they should be addressed now. There are a couple other fixes here, mainly mlx4. There are still two outstanding issues that need resolved but I don't think their fix will make this kernel cycle. Summary: - Various fixes to rdmavt, ipoib, mlx5, mlx4, rxe" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: IB/rdmavt: Don't vfree a kzalloc'ed memory region IB/rxe: Fix kmem_cache leak IB/rxe: Fix race condition between requester and completer IB/rxe: Fix duplicate atomic request handling IB/rxe: Fix kernel panic in udp_setup_tunnel IB/mlx5: Set source mac address in FTE IB/mlx5: Enable MAD_IFC commands for IB ports only IB/mlx4: Diagnostic HW counters are not supported in slave mode IB/mlx4: Use correct subnet-prefix in QP1 mads under SR-IOV IB/mlx4: Fix code indentation in QP1 MAD flow IB/mlx4: Fix incorrect MC join state bit-masking on SR-IOV IB/ipoib: Don't allow MC joins during light MC flush IB/rxe: fix GFP_KERNEL in spinlock context
2016-09-16IB/rdmavt, IB/qib, IB/hfi1: Use new QP put get routinesMike Marciniszyn7-26/+20
This improves readability and hides the reference count mechanism from the client drivers. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-09-16IB/rdmavt: Don't vfree a kzalloc'ed memory regionColin Ian King1-1/+1
The userspace memory region 'mr' is allocated with kzalloc in __rvt_alloc_mr however it is incorrectly being freed with vfree in __rvt_free_mr. Fix this by using kfree to free it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-09-16IB/rxe: Fix kmem_cache leakYonatan Cohen1-0/+13
Decrement qp reference when handling error path in completer to prevent kmem_cache leak. Fixes: 8700e3e7c485 ("Soft RoCE driver") Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-09-16IB/rxe: Fix race condition between requester and completerYonatan Cohen1-13/+44
rxe_requester() is sending a pkt with rxe_xmit_packet() and then calls rxe_update() to update the wqe and qp's psn values. But sometimes the response is received before the requester had time to update the wqe in which case the completer acts on errornous wqe values. This fix updates the wqe and qp before actually sending the request and rolls back when xmit fails. Fixes: 8700e3e7c485 ("Soft RoCE driver") Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-09-16IB/rxe: Fix duplicate atomic request handlingYonatan Cohen1-5/+6
When handling ack for atomic opcodes like "fetch&add" or "cmp&swp", the method send_atomic_ack() saves the ack before sending it, in case it gets lost and never reach the requester. In which case the method duplicate_request() will need to find it using the duplicated request.psn. But send_atomic_ack() used a wrong psn value and thus the above ack was never found. This fix uses the ack.psn to locate the ack in case its needed. This fix also copies the ack packet to the skb's control buffer since duplicate_request() will need it when calling rxe_xmit_packet() Fixes: 8700e3e7c485 ("Soft RoCE driver") Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>