summaryrefslogtreecommitdiffstats
path: root/drivers/net/ethernet/qlogic/qed/qed_dev.c
AgeCommit message (Collapse)AuthorFilesLines
2020-09-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller1-1/+10
Two minor conflicts: 1) net/ipv4/route.c, adding a new local variable while moving another local variable and removing it's initial assignment. 2) drivers/net/dsa/microchip/ksz9477.c, overlapping changes. One pretty prints the port mode differently, whilst another changes the driver to try and obtain the port mode from the port node rather than the switch node. Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-09net: qed: Disable aRFS for NPAR and 100GDmitry Bogdanov1-1/+10
In CMT and NPAR the PF is unknown when the GFS block processes the packet. Therefore cannot use searcher as it has a per PF database, and thus ARFS must be disabled. Fixes: d51e4af5c209 ("qed: aRFS infrastructure support") Signed-off-by: Manish Chopra <manishc@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski1-2/+2
We got slightly different patches removing a double word in a comment in net/ipv4/raw.c - picked the version from net. Simple conflict in drivers/net/ethernet/ibm/ibmvnic.c. Use cached values instead of VNIC login response buffer (following what commit 507ebe6444a4 ("ibmvnic: Fix use-after-free of VNIC login response buffer") did). Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-08-24qed: implement devlink info requestIgor Russkikh1-0/+9
Here we return existing fw & mfw versions, we also fetch device's serial number: ~$ sudo ~/iproute2/devlink/devlink dev info pci/0000:01:00.1: driver qed board.serial_number REE1915E44552 versions: running: fw.app 8.42.2.0 stored: fw.mgmt 8.52.10.0 MFW and FW are different firmwares on device. Management is a firmware responsible for link configuration and various control plane features. Its permanent and resides in NVM. Running FW (or fastpath FW) is an embedded microprogram implementing all the packet processing, offloads, etc. This FW is being loaded on each start by the driver from FW binary blob. The base device specific structure (qed_dev_info) was not directly available to the base driver before. Thus, here we create and store a private copy of this structure in qed_dev root object to access the data. Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: Alexander Lobakin <alobakin@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-23treewide: Use fallthrough pseudo-keywordGustavo A. R. Silva1-2/+2
Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through markings when it is the case. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-07-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller1-1/+1
The UDP reuseport conflict was a little bit tricky. The net-next code, via bpf-next, extracted the reuseport handling into a helper so that the BPF sk lookup code could invoke it. At the same time, the logic for reuseport handling of unconnected sockets changed via commit efc6b6f6c3113e8b203b9debfb72d81e0f3dcace which changed the logic to carry on the reuseport result into the rest of the lookup loop if we do not return immediately. This requires moving the reuseport_has_conns() logic into the callers. While we are here, get rid of inline directives as they do not belong in foo.c files. The other changes were cases of more straightforward overlapping modifications. Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22qed: move chain methods to a separate fileAlexander Lobakin1-273/+0
Move chain allocation/freeing functions to a new file to not mix it with hardware-related code. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Alexander Lobakin <alobakin@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-21qed: suppress false-positives interrupt error messages on HW initAlexander Lobakin1-1/+1
It was found that qed_pglueb_rbc_attn_handler() can produce a lot of false-positive error detections on driver load/reload (especially after crashes/recoveries) and spam the kernel log: [ 4.958275] [qed_pglueb_rbc_attn_handler:324()]ICPL error - 00d00ff0 [ 2079.146764] [qed_pglueb_rbc_attn_handler:324()]ICPL error - 00d80ff0 [ 2116.374631] [qed_pglueb_rbc_attn_handler:324()]ICPL error - 00d80ff0 [ 2135.250564] [qed_pglueb_rbc_attn_handler:324()]ICPL error - 00d80ff0 [...] Reduce the logging level of two false-positive prone error messages from notice to verbose on initialization (only) to not mix it with real error attentions while debugging. Fixes: 666db4862f2d ("qed: Revise load sequence to avoid PCI errors") Signed-off-by: Alexander Lobakin <alobakin@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20qed: add support for the extended speed and FEC modesAlexander Lobakin1-9/+93
Add all necessary code (NVM parsing, MFW and Ethtool reports etc.) to support extended speed and FEC modes. These new modes are supported by the new boards revisions and newer MFW versions. Misc: correct port type for MEDIA_KR. Signed-off-by: Alexander Lobakin <alobakin@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20qed: add support for new port modesAlexander Lobakin1-0/+5
These ports ship on new boards revisions and are supported by newer firmware versions. Signed-off-by: Alexander Lobakin <alobakin@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20qed: remove unused qed_hw_info::port_mode and QED_PORT_MODEAlexander Lobakin1-21/+0
Struct field qed_hw_info::port_mode isn't used anywhere in the code, so can be safely removed to prevent possible dead code addition. Also remove the enumeration QED_PORT_MODE orphaned after this deletion. Signed-off-by: Alexander Lobakin <alobakin@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20qed: add support for Forward Error CorrectionAlexander Lobakin1-17/+37
Add all necessary routines for reading supported FEC modes from NVM and querying FEC control to the MFW (if the running version supports it). Signed-off-by: Alexander Lobakin <alobakin@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-11Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller1-9/+3
All conflicts seemed rather trivial, with some guidance from Saeed Mameed on the tc_ct.c one. Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-09qed: Populate nvm-file attributes while reading nvm config partition.Sudarsana Reddy Kalluru1-9/+3
NVM config file address will be modified when the MBI image is upgraded. Driver would return stale config values if user reads the nvm-config (via ethtool -d) in this state. The fix is to re-populate nvm attribute info while reading the nvm config values/partition. Changes from previous version: ------------------------------- v3: Corrected the formatting in 'Fixes' tag. v2: Added 'Fixes' tag. Fixes: 1ac4329a1cff ("qed: Add configuration information to register dump and debug data") Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30net: qed: update copyright yearsAlexander Lobakin1-0/+1
Set the actual copyright holder and years in all qed source files. Signed-off-by: Alexander Lobakin <alobakin@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30net: qed: convert to SPDX License IdentifiersAlexander Lobakin1-28/+1
QLogic QED drivers source code is dual licensed under GPL-2.0/BSD-3-Clause. Remove all the boilerplates in the existing code and replace it with the correct SPDX tag. Signed-off-by: Alexander Lobakin <alobakin@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23net: qed: fix "maybe uninitialized" warningAlexander Lobakin1-1/+1
Variable 'abs_ppfid' in qed_dev.c:qed_llh_add_mac_filter() always gets printed, but is initialized only under 'ref_cnt == 1' condition. This results in: In file included from ./include/linux/kernel.h:15:0, from ./include/asm-generic/bug.h:19, from ./arch/x86/include/asm/bug.h:86, from ./include/linux/bug.h:5, from ./include/linux/io.h:11, from drivers/net/ethernet/qlogic/qed/qed_dev.c:35: drivers/net/ethernet/qlogic/qed/qed_dev.c: In function 'qed_llh_add_mac_filter': ./include/linux/printk.h:358:2: warning: 'abs_ppfid' may be used uninitialized in this function [-Wmaybe-uninitialized] printk(KERN_NOTICE pr_fmt(fmt), ##__VA_ARGS__) ^~~~~~ drivers/net/ethernet/qlogic/qed/qed_dev.c:983:17: note: 'abs_ppfid' was declared here u8 filter_idx, abs_ppfid; ^~~~~~~~~ ...under W=1+. Fix this by initializing it with zero. Fixes: 79284adeb99e ("qed: Add llh ppfid interface and 100g support for offload protocols") Signed-off-by: Alexander Lobakin <alobakin@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23net: qed: fix async event callbacks unregisteringAlexander Lobakin1-2/+7
qed_spq_unregister_async_cb() should be called before qed_rdma_info_free() to avoid crash-spawning uses-after-free. Instead of calling it from each subsystem exit code, do it in one place on PF down. Fixes: 291d57f67d24 ("qed: Fix rdma_info structure allocation") Signed-off-by: Alexander Lobakin <alobakin@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-21qed: changes to ILT to support XRCYuval Basson1-1/+5
First ILT page for TSDM client is allocated for XRC-SRQ's. For regular SRQ's skip first ILT page that is reserved for XRC-SRQ's. Signed-off-by: Michal Kalderon <mkalderon@marvell.com> Signed-off-by: Yuval Bason <ybason@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-14net: qed: invoke err notify on critical areasIgor Russkikh1-1/+3
In a number of critical places not only debug trace should be printed, but the appropriate hw error condition should be raised and error handling/recovery should start. Introduce our new qed_hw_err_notify invocation in these places to record and indicate critical error conditions in hardware. Signed-off-by: Ariel Elior <ariel.elior@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-20qed: use true,false for bool variablesJason Yan1-2/+2
Fix the following coccicheck warning: drivers/net/ethernet/qlogic/qed/qed_dev.c:4395:2-34: WARNING: Assignment of 0/1 to bool variable drivers/net/ethernet/qlogic/qed/qed_dev.c:1975:2-34: WARNING: Assignment of 0/1 to bool variable Signed-off-by: Jason Yan <yanaijie@huawei.com> Acked-by: Michal KalderonĀ <michal.kalderon@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30qed: Fix use after free in qed_chain_freeYuval Basson1-22/+16
The qed_chain data structure was modified in commit 1a4a69751f4d ("qed: Chain support for external PBL") to support receiving an external pbl (due to iWARP FW requirements). The pages pointed to by the pbl are allocated in qed_chain_alloc and their virtual address are stored in an virtual addresses array to enable accessing and freeing the data. The physical addresses however weren't stored and were accessed directly from the external-pbl during free. Destroy-qp flow, leads to freeing the external pbl before the chain is freed, when the chain is freed it tries accessing the already freed external pbl, leading to a use-after-free. Therefore we need to store the physical addresses in additional to the virtual addresses in a new data structure. Fixes: 1a4a69751f4d ("qed: Chain support for external PBL") Signed-off-by: Michal Kalderon <mkalderon@marvell.com> Signed-off-by: Yuval Bason <ybason@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-01qed: Fix a error code in qed_hw_init()Dan Carpenter1-0/+1
If the qed_fw_overlay_mem_alloc() then we should return -ENOMEM instead of success. Fixes: 30d5f85895fa ("qed: FW 8.42.2.0 Add fw overlay feature") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-01-27qed: FW 8.42.2.0 debug featuresMichal Kalderon1-1/+1
Add to debug dump more information on the platform it was collected from (pci func, path id). Provide human readable reg fifo erros. Removed static debug arrays from HSI Functions, and move them to the hwfn. Some structures were slightly changed (removing reserved chip id for example) which lead to many long initializations being modified with one parameter less during initialization. This leads to some long diffs that don't really change anything. Signed-off-by: Ariel Elior <ariel.elior@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-27qed: rt init valid initialization changedMichal Kalderon1-3/+0
The QM phase init tool can be invoked multiple times during the driver lifetime. Part of the init comes from the runtime array. The logic for setting the values did not init all values, basically assuming the runtime array was all zeroes. But if it was invoked multiple times, nobody was zeroing it after the first time. In this change we zero the runtime array right after using it. Signed-off-by: Ariel Elior <ariel.elior@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-27qed: FW 8.42.2.0 Add fw overlay featureMichal Kalderon1-1/+17
This feature enables the FW to page out FW code when required Signed-off-by: Ariel Elior <ariel.elior@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-27qed: Add abstraction for different hsi values per chipMichal Kalderon1-20/+42
The number of BTB blocks was modified to be different between the two chip flavors supported (BB/K2) as a result, this lead to a re-write of selecting the default hsi value based on the chip. This patch creates a lookup table for hsi values per chip rather than ask again and again for every value. Signed-off-by: Ariel Elior <ariel.elior@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-27qed: FW 8.42.2.0 Additional ll2 typeMichal Kalderon1-6/+14
LL2 queues were a limited resource due to FW constraints. This FW introduced a new resource which is a context based ll2 queue (memory on host). The additional ll2 queues are required for RDMA SRIOV. The code refers to the previous ll2 queues as ram-based or legacy, and the new queues as ctx-based. This change decreased the "legacy" ram-based queues therefore the first ll2 queue used for iWARP was converted to the ctx-based ll2 queue. This feature also exposed a bug in the DIRECT_REG_WR64 macro implementation which didn't have an effect in other use cases. Signed-off-by: Ariel Elior <ariel.elior@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-27qed: Use dmae to write to widebus registers in fw_funcsMichal Kalderon1-1/+1
There are several wide-bus registers written to by the fw_funcs that require using the dmae for atomicity. Therefore using the dmae channel functionality was added to the fw_funcs file, since the code is very similar to the previously used code, the structures used were moved to qed_hsi. Due to FW conventions, the names of the flags in the struct changed. Since this required slight modification in the places that set the flags the code was modified to use GET/SET FIELD macros. Signed-off-by: Ariel Elior <ariel.elior@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-27qed: FW 8.42.2.0 Queue Manager changesMichal Kalderon1-11/+10
This patch contains changes in initialization and usage of the QM blocks. Instead of setting a rate limiter per vport the rate limiters are now a global resource and set independentaly. The patch also contains a field name change: vport_wfq which is part of vport_params was renamed to wfq as the vport prefix is redundant. Signed-off-by: Ariel Elior <ariel.elior@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-18qed: Fix -Wmaybe-uninitialized false positiveArnd Bergmann1-0/+1
A previous attempt to shut up the uninitialized variable use warning was apparently insufficient. When CONFIG_PROFILE_ANNOTATED_BRANCHES is set, gcc-8 still warns, because the unlikely() check in DP_NOTICE() causes it to no longer track the state of all variables correctly: drivers/net/ethernet/qlogic/qed/qed_dev.c: In function 'qed_llh_set_ppfid_affinity': drivers/net/ethernet/qlogic/qed/qed_dev.c:798:47: error: 'abs_ppfid' may be used uninitialized in this function [-Werror=maybe-uninitialized] addr = NIG_REG_PPF_TO_ENGINE_SEL + abs_ppfid * 0x4; ~~~~~~~~~~^~~~~ This is not a nice workaround, but always initializing the output from qed_llh_abs_ppfid() at least shuts up the false positive reliably. Fixes: 79284adeb99e ("qed: Add llh ppfid interface and 100g support for offload protocols") Fixes: 8e2ea3ea9625 ("qed: Fix static checker warning") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Michal KalderonĀ <michal.kalderon@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-30qed: Fix static checker warningMichal Kalderon1-12/+12
In some cases abs_ppfid could be printed without being initialized. Fixes: 79284adeb99e ("qed: Add llh ppfid interface and 100g support for offload protocols") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-29qed: fix spelling mistake "inculde" -> "include"Colin Ian King1-1/+1
There is a spelling mistake in a DP_INFO message. Fix it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Michal KalderonĀ <michal.kalderon@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-26qed: Set the doorbell address correctlyMichal Kalderon1-11/+18
In 100g mode the doorbell bar is united for both engines. Set the correct offset in the hwfn so that the doorbell returned for RoCE is in the affined hwfn. Signed-off-by: Ariel Elior <ariel.elior@marvell.com> Signed-off-by: Denis Bolotin <denis.bolotin@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-26qed: Add llh ppfid interface and 100g support for offload protocolsMichal Kalderon1-263/+983
This patch refactors the current llh implementation. It exposes a hw resource called ppfid (port-pfid) and implements an API for configuring the resource. Default configuration which was used until now limited the number of filters per PF and did not support engine affinity per protocol. The new API enables allocating more filter rules per PF and enables affinitizing protocol packets to a certain engine which enables full 100g protocol offload support. Signed-off-by: Ariel Elior <ariel.elior@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-17Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-51/+34
Conflict resolution of af_smc.c from Stephen Rothwell. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-14qed: Fix the doorbell address sanity checkDenis Bolotin1-8/+8
Fix the condition which verifies that doorbell address is inside the doorbell bar by checking that the end of the address is within range as well. Signed-off-by: Denis Bolotin <dbolotin@marvell.com> Signed-off-by: Michal Kalderon <mkalderon@marvell.com> Signed-off-by: Ariel Elior <aelior@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-14qed: Delete redundant doorbell recovery typesDenis Bolotin1-43/+26
DB_REC_DRY_RUN (running doorbell recovery without sending doorbells) is never used. DB_REC_ONCE (send a single doorbell from the doorbell recovery) is not needed anymore because by running the periodic handler we make sure we check the overflow status later instead. This patch is needed because in the next patches, the only doorbell recovery type being used is DB_REC_REAL_DEAL, and the fixes are much cleaner without this enum. Signed-off-by: Denis Bolotin <dbolotin@marvell.com> Signed-off-by: Michal Kalderon <mkalderon@marvell.com> Signed-off-by: Ariel Elior <aelior@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-20qed: Define new MF bit for no_vlan configSudarsana Reddy Kalluru1-2/+4
The patch introduces a new Multi-Function bit for cases where firmware shouldn't perform the insertion of vlan-0 tag. The new bit is defined to abstract the implementation from the actual MF mode. Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com> Signed-off-by: Ariel Elior <aelior@marvell.com> Signed-off-by: Michal Kalderon <mkalderon@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-21qed: Read device port count from the shmemSudarsana Reddy Kalluru1-60/+35
Read port count from the shared memory instead of driver deriving this value. This change simplifies the driver implementation and also avoids any dependencies for finding the port-count. Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com> Signed-off-by: Michal Kalderon <mkalderon@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-4/+4
2019-01-28qed: Fix stack out of bounds bugManish Chopra1-4/+4
KASAN reported following bug in qed_init_qm_get_idx_from_flags due to inappropriate casting of "pq_flags". Fix the type of "pq_flags". [ 196.624707] BUG: KASAN: stack-out-of-bounds in qed_init_qm_get_idx_from_flags+0x1a4/0x1b8 [qed] [ 196.624712] Read of size 8 at addr ffff809b00bc7360 by task kworker/0:9/1712 [ 196.624714] [ 196.624720] CPU: 0 PID: 1712 Comm: kworker/0:9 Not tainted 4.18.0-60.el8.aarch64+debug #1 [ 196.624723] Hardware name: To be filled by O.E.M. Saber/Saber, BIOS 0ACKL024 09/26/2018 [ 196.624733] Workqueue: events work_for_cpu_fn [ 196.624738] Call trace: [ 196.624742] dump_backtrace+0x0/0x2f8 [ 196.624745] show_stack+0x24/0x30 [ 196.624749] dump_stack+0xe0/0x11c [ 196.624755] print_address_description+0x68/0x260 [ 196.624759] kasan_report+0x178/0x340 [ 196.624762] __asan_report_load_n_noabort+0x38/0x48 [ 196.624786] qed_init_qm_get_idx_from_flags+0x1a4/0x1b8 [qed] [ 196.624808] qed_init_qm_info+0xec0/0x2200 [qed] [ 196.624830] qed_resc_alloc+0x284/0x7e8 [qed] [ 196.624853] qed_slowpath_start+0x6cc/0x1ae8 [qed] [ 196.624864] __qede_probe.isra.10+0x1cc/0x12c0 [qede] [ 196.624874] qede_probe+0x78/0xf0 [qede] [ 196.624879] local_pci_probe+0xc4/0x180 [ 196.624882] work_for_cpu_fn+0x54/0x98 [ 196.624885] process_one_work+0x758/0x1900 [ 196.624888] worker_thread+0x4e0/0xd18 [ 196.624892] kthread+0x2c8/0x350 [ 196.624897] ret_from_fork+0x10/0x18 [ 196.624899] [ 196.624902] Allocated by task 2: [ 196.624906] kasan_kmalloc.part.1+0x40/0x108 [ 196.624909] kasan_kmalloc+0xb4/0xc8 [ 196.624913] kasan_slab_alloc+0x14/0x20 [ 196.624916] kmem_cache_alloc_node+0x1dc/0x480 [ 196.624921] copy_process.isra.1.part.2+0x1d8/0x4a98 [ 196.624924] _do_fork+0x150/0xfa0 [ 196.624926] kernel_thread+0x48/0x58 [ 196.624930] kthreadd+0x3a4/0x5a0 [ 196.624932] ret_from_fork+0x10/0x18 [ 196.624934] [ 196.624937] Freed by task 0: [ 196.624938] (stack is not available) [ 196.624940] [ 196.624943] The buggy address belongs to the object at ffff809b00bc0000 [ 196.624943] which belongs to the cache thread_stack of size 32768 [ 196.624946] The buggy address is located 29536 bytes inside of [ 196.624946] 32768-byte region [ffff809b00bc0000, ffff809b00bc8000) [ 196.624948] The buggy address belongs to the page: [ 196.624952] page:ffff7fe026c02e00 count:1 mapcount:0 mapping:ffff809b4001c000 index:0x0 compound_mapcount: 0 [ 196.624960] flags: 0xfffff8000008100(slab|head) [ 196.624967] raw: 0fffff8000008100 dead000000000100 dead000000000200 ffff809b4001c000 [ 196.624970] raw: 0000000000000000 0000000000080008 00000001ffffffff 0000000000000000 [ 196.624973] page dumped because: kasan: bad access detected [ 196.624974] [ 196.624976] Memory state around the buggy address: [ 196.624980] ffff809b00bc7200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 196.624983] ffff809b00bc7280: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 196.624985] >ffff809b00bc7300: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 04 f2 f2 f2 [ 196.624988] ^ [ 196.624990] ffff809b00bc7380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 196.624993] ffff809b00bc7400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 196.624995] ================================================================== Signed-off-by: Manish Chopra <manishc@marvell.com> Signed-off-by: Ariel Elior <aelior@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-28qed: Add infrastructure for error detection and recoveryTomer Tayar1-14/+27
This patch adds the detection and handling of a parity error ("process kill event"), including the update of the protocol drivers, and the prevention of any HW access that will lead to device access towards the host while recovery is in progress. It also provides the means for the protocol drivers to trigger a recovery process on their decision. Signed-off-by: Tomer Tayar <tomer.tayar@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-28qed: Revise load sequence to avoid PCI errorsTomer Tayar1-49/+68
Initiating final cleanup after an ungraceful driver unload can lead to bad PCI accesses towards the host. This patch revises the load sequence so final cleanup is sent while the internal master enable is cleared, to prevent the host accesses, and clears the internal error indications just before enabling the internal master enable. Signed-off-by: Tomer Tayar <tomer.tayar@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-04qed: fix spelling mistake "Dispalying" -> "Displaying"Colin Ian King1-1/+1
There is a spelling mistake in a DP_NOTICE message, fix it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-30qed: Use the doorbell overflow recovery mechanism in case of doorbell overflowAriel Elior1-3/+11
In case of an attention from the doorbell queue block, analyze the HW indications. In case of a doorbell overflow, execute a doorbell recovery. Since there can be spurious indications (race conditions between multiple PFs), schedule a periodic task for checking whether a doorbell overflow may have been missed. After a set time with no indications, terminate the periodic task. Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-30qed: Add doorbell overflow recovery mechanismAriel Elior1-0/+320
Add the database used to register doorbelling entities, and APIs for adding and deleting entries, and logic for traversing the database and doorbelling once on behalf of all entities. Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-19qed: Fix QM getters to always return a valid pqDenis Bolotin1-4/+20
The getter callers doesn't know the valid Physical Queues (PQ) values. This patch makes sure that a valid PQ will always be returned. The patch consists of 3 fixes: - When qed_init_qm_get_idx_from_flags() receives a disabled flag, it returned PQ 0, which can potentially be another function's pq. Verify that flag is enabled, otherwise return default start_pq. - When qed_init_qm_get_idx_from_flags() receives an unknown flag, it returned NULL and could lead to a segmentation fault. Return default start_pq instead. - A modulo operation was added to MCOS/VFS PQ getters to make sure the PQ returned is in range of the required flag. Fixes: b5a9ee7cf3be ("qed: Revise QM cofiguration") Signed-off-by: Denis Bolotin <denis.bolotin@cavium.com> Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-19qed: Fix bitmap_weight() checkDenis Bolotin1-1/+4
Fix the condition which verifies that only one flag is set. The API bitmap_weight() should receive size in bits instead of bytes. Fixes: b5a9ee7cf3be ("qed: Revise QM cofiguration") Signed-off-by: Denis Bolotin <denis.bolotin@cavium.com> Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-13qed: Fix rdma_info structure allocationMichal Kalderon1-4/+11
Certain flows need to access the rdma-info structure, for example dcbx update flows. In some cases there can be a race between the allocation or deallocation of the structure which was done in roce start / roce stop and an asynchrounous dcbx event that tries to access the structure. For this reason, we move the allocation of the rdma_info structure to be similar to the iscsi/fcoe info structures which are allocated during device setup. We add a new field of "active" to the struct to define whether roce has already been started or not, and this is checked instead of whether the pointer to the info structure. Fixes: 51ff17251c9c ("qed: Add support for RoCE hw init") Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com> Signed-off-by: Denis Bolotin <denis.bolotin@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>