Age | Commit message (Collapse) | Author | Files | Lines |
|
Define a new field in the IPA structure that records the maximum
number of entries that will be used in the IPA endpoint array. Use
that value rather than IPA_ENDPOINT_MAX to determine the end
condition for two loops that iterate over all endpoints.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Each endpoint ID has an entry in the IPA endpoint array. But the
size of that array is defined at compile time. Instead, rename
ipa_endpoint_data_valid() to be ipa_endpoint_max() and have it
return the maximum endpoint ID defined in configuration data.
That function will still validate configuration data.
Zero is returned on error; it's a valid endpoint ID, but we need
more than one, so it can't be the maximum. The next patch makes use
of the returned maximum value.
Finally, rename the "initialized" mask of endpoints defined by
configuration data to be "defined".
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Change two functions that iterate over all endpoints to use while
loops, using "endpoint_id" as the index variables in both spots.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Ensure all defined TX endpoints are in the range [0, CONS_PIPES) and
defined RX endpoints are within [PROD_LOWEST, PROD_LOWEST+PROD_PIPES).
Modify the way local variables are used to make the checks easier
to understand. Check for each endpoint being in valid range in the
loop, and drop the logical-AND check of initialized against
unavailable IDs.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
IPA v5.0 eliminates the global filter table entry. As a result,
there is no need to shift the filtered endpoint bitmap when it is
written to IPA local memory.
Update comments to explain this. Also delete a redundant block of
comments above the function.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Don't require IPA v5.0 to have a STATS_TETHERING memory region.
Downstream defines its size to 0, so it apparently is unused.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In preparation for adding support for IPA v5.0, define it as an
understood version.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In the event of a Tx hang it can be useful to read a variety of hardware
registers to capture some state about why the transmit queue got stuck.
Extend the ETHTOOL_GREGS dump provided by the ice driver with several CSR
registers that provide such relevant information regarding the hardware Tx
state. This enables capturing relevant data to enable debugging such a Tx
hang.
Signed-off-by: Lukasz Czapnik <lukasz.czapnik@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Link: https://lore.kernel.org/r/20221027104239.1691549-1-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
As a result of help from Frank Wunderlich to investigate and test, we
now know how to program this PCS for in-band 802.3z negotiation. Add
support for this by moving the contents of the two functions into the
common mtk_pcs_config() function and adding the register settings for
802.3z negotiation.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Program the link timer appropriately for the interface mode being
used, using the newly introduced phylink helper that provides the
nanosecond link timer interval.
The intervals are 1.6ms for SGMII based protocols and 10ms for
802.3z based protocols.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Program the advertisement into the mtk PCS block.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Move the selection of the underlying interface speed to the pcs_config
function, so we always program the interface speed.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The PHY power up is common to both configuration paths, so move it into
the parent function. We need to do this for all serdes modes.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add support for forcing the link speed and duplex setting in the
pcs_link_up() method for out of band modes, which will be useful when
we finish converting the pcs_config() method. Until then, we still have
to force duplex for 802.3z modes to work correctly.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
mtk_sgmii does a lot of read-modify-write operations, for which there
is a specific regmap function. Use this function instead of open-coding
the operations.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add a pcs_get_state() implementation which uses the advertisements
to compute the resulting link modes, and BMSR contents to determine
negotiation and link status.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The functions called by the pcs_config() method always return zero, so
there is no point trying to handle an error from these functions. Make
these functions void, eliminate the "err" variable and simply return
zero from the pcs_config() function itself.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
As a result of help from Frank Wunderlich to investigate and test, we
know a bit more about the PCS on the Mediatek platforms. Update the
definitions from this investigation.
This PCS appears similar, but not identical to the Lynx PCS.
Although not included in this patch, but for future reference, the PHY
ID registers at offset 4 read as 0x4d544950 'MTIP'.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Now that the 32bit UP oddity is gone and 32bit uses always a sequence
count, there is no need for the fetch_irq() variants anymore.
Convert to the regular interface.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
====================
pull-request: wireless-next-2022-10-28
First set of patches v6.2. mac80211 refactoring continues for Wi-Fi 7.
All mac80211 driver are now converted to use internal TX queues, this
might cause some regressions so we wanted to do this early in the
cycle.
Note: wireless tree was merged[1] to wireless-next to avoid some
conflicts with mac80211 patches between the trees. Unfortunately there
are still two smaller conflicts in net/mac80211/util.c which Stephen
also reported[2]. In the first conflict initialise scratch_len to
"params->scratch_len ?: 3 * params->len" (note number 3, not 2!) and
in the second conflict take the version which uses elems->scratch_pos.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next.git/commit/?id=dfd2d876b3fda1790bc0239ba4c6967e25d16e91
[2] https://lore.kernel.org/all/20221020032340.5cf101c0@canb.auug.org.au/
mac80211
- preparation for Wi-Fi 7 Multi-Link Operation (MLO) continues
- add API to show the link STAs in debugfs
- all mac80211 drivers are now using mac80211 internal TX queues (iTXQs)
rtw89
- support 8852BE
rtl8xxxu
- support RTL8188FU
brmfmac
- support two station interfaces concurrently
bcma
- support SPROM rev 11
====================
Link: https://lore.kernel.org/r/20221028132943.304ECC433B5@smtp.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Because we enable the clock immediately after acquiring it in probe,
we can combine the 2 operations and use devm_clk_get_optional_enabled()
helper.
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add MAC address related operations, and register netdev.
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Reset and initialize the hardware by configuring the MAC layer.
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Get PCI config space info, set LAN id and check flash status.
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add support for MDI-X status and configuration for GPY211 chips
Signed-off-by: Raju Lakkaraju <Raju.Lakkaraju@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
gpy_update_interface() is called from gpy_read_status() which does
return error codes. gpy_read_status() would benefit from returning
-EINVAL, etc.
Signed-off-by: Raju Lakkaraju <Raju.Lakkaraju@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
./drivers/net/ethernet/freescale/dpaa2/dpaa2-xsk.c:453:42-47: WARNING: conversion to bool not needed here
Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=2577
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Link: https://lore.kernel.org/r/20221026051824.38730-1-yang.lee@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The same pre-work code is used before each call to
ionic_rx_fill(), so bring it in and make it a part of
the routine.
Signed-off-by: Neel Patel <neel@pensando.io>
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Support stateless offloads for GRE, VXLAN, GENEVE, IPXIP4
and IPXIP6 when the FW supports them.
Signed-off-by: Neel Patel <neel@pensando.io>
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
A new ionic dev_cmd is added to the interface in ionic_if.h,
with a new capabilities field in the ionic device identity to
signal its availability in the FW. The identity level code is
incremented to '2' to show support for this new capabilities
bitfield.
If the driver has indicated with the new identity level that
it has the VF_CTRL command, newer FW will wait for the start
command before starting the VFs after a FW update or crash
recovery.
This patch updates the driver to make use of the new VF start
control in fw_up path to be sure that the PF has set the user
attributes on the VF before the FW allows the VFs to restart.
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Report the current FW values for the VF attributes, but don't
save the FW values locally, only save the vf attributes that
are given to us from the user. This allows us to replay user
data, and doesn't end up confusing things like "who set the
mac address".
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The VF attributes that the user has set into the FW through
the PF can be lost over a FW crash recovery. Much like we
already replay the PF mac/vlan filters, we now add a replay
in the recovery path to be sure the FW has the up-to-date
VF configurations.
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
drivers/net/can/usb/kvaser_usb/kvaser_usb_leaf.c
2871edb32f46 ("can: kvaser_usb: Fix possible completions during init_completion")
abb8670938b2 ("can: kvaser_usb_leaf: Ignore stale bus-off after start")
8d21f5927ae6 ("can: kvaser_usb_leaf: Fix improved state not being reported")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from 802.15.4 (Zigbee et al).
Current release - regressions:
- ipa: fix bugs in the register conversion for IPA v3.1 and v3.5.1
Current release - new code bugs:
- mptcp: fix abba deadlock on fastopen
- eth: stmmac: rk3588: allow multiple gmac controllers in one system
Previous releases - regressions:
- ip: rework the fix for dflt addr selection for connected nexthop
- net: couple more fixes for misinterpreting bits in struct page
after the signature was added
Previous releases - always broken:
- ipv6: ensure sane device mtu in tunnels
- openvswitch: switch from WARN to pr_warn on a user-triggerable path
- ethtool: eeprom: fix null-deref on genl_info in dump
- ieee802154: more return code fixes for corner cases in
dgram_sendmsg
- mac802154: fix link-quality-indicator recording
- eth: mlx5: fixes for IPsec, PTP timestamps, OvS and conntrack
offload
- eth: fec: limit register access on i.MX6UL
- eth: bcm4908_enet: update TX stats after actual transmission
- can: rcar_canfd: improve IRQ handling for RZ/G2L
Misc:
- genetlink: piggy back on the newly added resv_op_start to enforce
more sanity checks on new commands"
* tag 'net-6.1-rc3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (57 commits)
net: enetc: survive memory pressure without crashing
kcm: do not sense pfmemalloc status in kcm_sendpage()
net: do not sense pfmemalloc status in skb_append_pagefrags()
net/mlx5e: Fix macsec sci endianness at rx sa update
net/mlx5e: Fix wrong bitwise comparison usage in macsec_fs_rx_add_rule function
net/mlx5e: Fix macsec rx security association (SA) update/delete
net/mlx5e: Fix macsec coverity issue at rx sa update
net/mlx5: Fix crash during sync firmware reset
net/mlx5: Update fw fatal reporter state on PCI handlers successful recover
net/mlx5e: TC, Fix cloned flow attr instance dests are not zeroed
net/mlx5e: TC, Reject forwarding from internal port to internal port
net/mlx5: Fix possible use-after-free in async command interface
net/mlx5: ASO, Create the ASO SQ with the correct timestamp format
net/mlx5e: Update restore chain id for slow path packets
net/mlx5e: Extend SKB room check to include PTP-SQ
net/mlx5: DR, Fix matcher disconnect error flow
net/mlx5: Wait for firmware to enable CRS before pci_restore_state
net/mlx5e: Do not increment ESN when updating IPsec ESN state
netdevsim: remove dir in nsim_dev_debugfs_init() when creating ports dir failed
netdevsim: fix memory leak in nsim_drv_probe() when nsim_dev_resources_register() failed
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media fixes from Mauro Carvalho Chehab:
"A bunch of patches addressing issues in the vivid driver and adding
new checks in V4L2 to validate the input parameters from some ioctls"
* tag 'media/v6.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
media: vivid.rst: loop_video is set on the capture devnode
media: vivid: set num_in/outputs to 0 if not supported
media: vivid: drop GFP_DMA32
media: vivid: fix control handler mutex deadlock
media: videodev2.h: V4L2_DV_BT_BLANKING_HEIGHT should check 'interlaced'
media: v4l2-dv-timings: add sanity checks for blanking values
media: vivid: dev->bitmap_cap wasn't freed in all cases
media: vivid: s_fbuf: add more sanity checks
|
|
Under memory pressure, enetc_refill_rx_ring() may fail, and when called
during the enetc_open() -> enetc_setup_rxbdr() procedure, this is not
checked for.
An extreme case of memory pressure will result in exactly zero buffers
being allocated for the RX ring, and in such a case it is expected that
hardware drops all RX packets due to lack of buffers.
This does not happen, because the reset-default value of the consumer
and produces index is 0, and this makes the ENETC think that all buffers
have been initialized and that it owns them (when in reality none were).
The hardware guide explains this best:
| Configure the receive ring producer index register RBaPIR with a value
| of 0. The producer index is initially configured by software but owned
| by hardware after the ring has been enabled. Hardware increments the
| index when a frame is received which may consume one or more BDs.
| Hardware is not allowed to increment the producer index to match the
| consumer index since it is used to indicate an empty condition. The ring
| can hold at most RBLENR[LENGTH]-1 received BDs.
|
| Configure the receive ring consumer index register RBaCIR. The
| consumer index is owned by software and updated during operation of the
| of the BD ring by software, to indicate that any receive data occupied
| in the BD has been processed and it has been prepared for new data.
| - If consumer index and producer index are initialized to the same
| value, it indicates that all BDs in the ring have been prepared and
| hardware owns all of the entries.
| - If consumer index is initialized to producer index plus N, it would
| indicate N BDs have been prepared. Note that hardware cannot start if
| only a single buffer is prepared due to the restrictions described in
| (2).
| - Software may write consumer index to match producer index anytime
| while the ring is operational to indicate all received BDs prior have
| been processed and new BDs prepared for hardware.
Normally, the value of rx_ring->rcir (consumer index) is brought in sync
with the rx_ring->next_to_use software index, but this only happens if
page allocation ever succeeded.
When PI==CI==0, the hardware appears to receive frames and write them to
DMA address 0x0 (?!), then set the READY bit in the BD.
The enetc_clean_rx_ring() function (and its XDP derivative) is naturally
not prepared to handle such a condition. It will attempt to process
those frames using the rx_swbd structure associated with index i of the
RX ring, but that structure is not fully initialized (enetc_new_page()
does all of that). So what happens next is undefined behavior.
To operate using no buffer, we must initialize the CI to PI + 1, which
will block the hardware from advancing the CI any further, and drop
everything.
The issue was seen while adding support for zero-copy AF_XDP sockets,
where buffer memory comes from user space, which can even decide to
supply no buffers at all (example: "xdpsock --txonly"). However, the bug
is present also with the network stack code, even though it would take a
very determined person to trigger a page allocation failure at the
perfect time (a series of ifup/ifdown under memory pressure should
eventually reproduce it given enough retries).
Fixes: d4fd0404c1c9 ("enetc: Introduce basic PF and VF ENETC ethernet drivers")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Link: https://lore.kernel.org/r/20221027182925.3256653-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The cited commit at rx sa update operation passes the sci object
attribute, in the wrong endianness and not as expected by the HW
effectively create malformed hw sa context in case of update rx sa
consequently, HW produces unexpected MACsec packets which uses this
sa.
Fix by passing sci to create macsec object with the correct endianness,
while at it add __force u64 to prevent sparse check error of type
"sparse: error: incorrect type in assignment".
Fixes: aae3454e4d4c ("net/mlx5e: Add MACsec offload Rx command support")
Signed-off-by: Raed Salem <raeds@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221026135153.154807-16-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The cited commit produces a sparse check error of type
"sparse: error: restricted __be64 degrades to integer". The
offending line wrongly did a bitwise operation between two different
storage types one of 64 bit when the other smaller side is 16 bit
which caused the above sparse error, furthermore bitwise operation
usage here is wrong in the first place as the constant MACSEC_PORT_ES
is not a bitwise field.
Fix by using the right mask to get the lower 16 bit if the sci number,
and use comparison operator '==' instead of bitwise '&' operator.
Fixes: 3b20949cb21b ("net/mlx5e: Add MACsec RX steering rules")
Signed-off-by: Raed Salem <raeds@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221026135153.154807-15-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The cited commit adds the support for update/delete MACsec Rx SA,
naturally, these operations need to check if the SA in question exists
to update/delete the SA and return error code otherwise, however they
do just the opposite i.e. return with error if the SA exists
Fix by change the check to return error in case the SA in question does
not exist, adjust error message and code accordingly.
Fixes: aae3454e4d4c ("net/mlx5e: Add MACsec offload Rx command support")
Signed-off-by: Raed Salem <raeds@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221026135153.154807-14-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The cited commit at update rx sa operation passes object attributes
to MACsec object create function without initializing/setting all
attributes fields leaving some of them with garbage values, therefore
violating the implicit assumption at create object function, which
assumes that all input object attributes fields are set.
Fix by initializing the object attributes struct to zero, thus leaving
unset fields with the legal zero value.
Fixes: aae3454e4d4c ("net/mlx5e: Add MACsec offload Rx command support")
Signed-off-by: Raed Salem <raeds@nvidia.com>
Reviewed-by: Lior Nahmanson <liorna@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221026135153.154807-13-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When setting Bluefield to DPU NIC mode using mlxconfig tool + sync
firmware reset flow, we run into scenario where the host was not
eswitch manager at the time of mlx5 driver load but becomes eswitch manager
after the sync firmware reset flow. This results in null pointer
access of mpfs structure during mac filter add. This change prevents null
pointer access but mpfs table entries will not be added.
Fixes: 5ec697446f46 ("net/mlx5: Add support for devlink reload action fw activate")
Signed-off-by: Suresh Devarakonda <ramad@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Bodong Wang <bodong@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221026135153.154807-12-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Update devlink health fw fatal reporter state to "healthy" is needed by
strictly calling devlink_health_reporter_state_update() after recovery
was done by PCI error handler. This is needed when fw_fatal reporter was
triggered due to PCI error. Poll health is called and set reporter state
to error. Health recovery failed (since EEH didn't re-enable the PCI).
PCI handlers keep on recover flow and succeed later without devlink
acknowledgment. Fix this by adding devlink state update at the end of
the PCI handler recovery process.
Fixes: 6181e5cb752e ("devlink: add support for reporter recovery completion")
Signed-off-by: Roy Novich <royno@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Aya Levin <ayal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221026135153.154807-11-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
On multi table split the driver creates a new attr instance with
data being copied from prev attr instance zeroing action flags.
Also need to reset dests properties to avoid incorrect dests per attr.
Fixes: 8300f225268b ("net/mlx5e: Create new flow attr for multi table actions")
Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221026135153.154807-10-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Reject TC rules that forward from internal port to internal port
as it is not supported.
This include rules that are explicitly have internal port as
the filter device as well as rules that apply on tunnel interfaces
as the route device for the tunnel interface can be an internal
port.
Fixes: 27484f7170ed ("net/mlx5e: Offload tc rules that redirect to ovs internal port")
Signed-off-by: Ariel Levkovich <lariel@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221026135153.154807-9-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
mlx5_cmd_cleanup_async_ctx should return only after all its callback
handlers were completed. Before this patch, the below race between
mlx5_cmd_cleanup_async_ctx and mlx5_cmd_exec_cb_handler was possible and
lead to a use-after-free:
1. mlx5_cmd_cleanup_async_ctx is called while num_inflight is 2 (i.e.
elevated by 1, a single inflight callback).
2. mlx5_cmd_cleanup_async_ctx decreases num_inflight to 1.
3. mlx5_cmd_exec_cb_handler is called, decreases num_inflight to 0 and
is about to call wake_up().
4. mlx5_cmd_cleanup_async_ctx calls wait_event, which returns
immediately as the condition (num_inflight == 0) holds.
5. mlx5_cmd_cleanup_async_ctx returns.
6. The caller of mlx5_cmd_cleanup_async_ctx frees the mlx5_async_ctx
object.
7. mlx5_cmd_exec_cb_handler goes on and calls wake_up() on the freed
object.
Fix it by syncing using a completion object. Mark it completed when
num_inflight reaches 0.
Trace:
BUG: KASAN: use-after-free in do_raw_spin_lock+0x23d/0x270
Read of size 4 at addr ffff888139cd12f4 by task swapper/5/0
CPU: 5 PID: 0 Comm: swapper/5 Not tainted 6.0.0-rc3_for_upstream_debug_2022_08_30_13_10 #1
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Call Trace:
<IRQ>
dump_stack_lvl+0x57/0x7d
print_report.cold+0x2d5/0x684
? do_raw_spin_lock+0x23d/0x270
kasan_report+0xb1/0x1a0
? do_raw_spin_lock+0x23d/0x270
do_raw_spin_lock+0x23d/0x270
? rwlock_bug.part.0+0x90/0x90
? __delete_object+0xb8/0x100
? lock_downgrade+0x6e0/0x6e0
_raw_spin_lock_irqsave+0x43/0x60
? __wake_up_common_lock+0xb9/0x140
__wake_up_common_lock+0xb9/0x140
? __wake_up_common+0x650/0x650
? destroy_tis_callback+0x53/0x70 [mlx5_core]
? kasan_set_track+0x21/0x30
? destroy_tis_callback+0x53/0x70 [mlx5_core]
? kfree+0x1ba/0x520
? do_raw_spin_unlock+0x54/0x220
mlx5_cmd_exec_cb_handler+0x136/0x1a0 [mlx5_core]
? mlx5_cmd_cleanup_async_ctx+0x220/0x220 [mlx5_core]
? mlx5_cmd_cleanup_async_ctx+0x220/0x220 [mlx5_core]
mlx5_cmd_comp_handler+0x65a/0x12b0 [mlx5_core]
? dump_command+0xcc0/0xcc0 [mlx5_core]
? lockdep_hardirqs_on_prepare+0x400/0x400
? cmd_comp_notifier+0x7e/0xb0 [mlx5_core]
cmd_comp_notifier+0x7e/0xb0 [mlx5_core]
atomic_notifier_call_chain+0xd7/0x1d0
mlx5_eq_async_int+0x3ce/0xa20 [mlx5_core]
atomic_notifier_call_chain+0xd7/0x1d0
? irq_release+0x140/0x140 [mlx5_core]
irq_int_handler+0x19/0x30 [mlx5_core]
__handle_irq_event_percpu+0x1f2/0x620
handle_irq_event+0xb2/0x1d0
handle_edge_irq+0x21e/0xb00
__common_interrupt+0x79/0x1a0
common_interrupt+0x78/0xa0
</IRQ>
<TASK>
asm_common_interrupt+0x22/0x40
RIP: 0010:default_idle+0x42/0x60
Code: c1 83 e0 07 48 c1 e9 03 83 c0 03 0f b6 14 11 38 d0 7c 04 84 d2 75 14 8b 05 eb 47 22 02 85 c0 7e 07 0f 00 2d e0 9f 48 00 fb f4 <c3> 48 c7 c7 80 08 7f 85 e8 d1 d3 3e fe eb de 66 66 2e 0f 1f 84 00
RSP: 0018:ffff888100dbfdf0 EFLAGS: 00000242
RAX: 0000000000000001 RBX: ffffffff84ecbd48 RCX: 1ffffffff0afe110
RDX: 0000000000000004 RSI: 0000000000000000 RDI: ffffffff835cc9bc
RBP: 0000000000000005 R08: 0000000000000001 R09: ffff88881dec4ac3
R10: ffffed1103bd8958 R11: 0000017d0ca571c9 R12: 0000000000000005
R13: ffffffff84f024e0 R14: 0000000000000000 R15: dffffc0000000000
? default_idle_call+0xcc/0x450
default_idle_call+0xec/0x450
do_idle+0x394/0x450
? arch_cpu_idle_exit+0x40/0x40
? do_idle+0x17/0x450
cpu_startup_entry+0x19/0x20
start_secondary+0x221/0x2b0
? set_cpu_sibling_map+0x2070/0x2070
secondary_startup_64_no_verify+0xcd/0xdb
</TASK>
Allocated by task 49502:
kasan_save_stack+0x1e/0x40
__kasan_kmalloc+0x81/0xa0
kvmalloc_node+0x48/0xe0
mlx5e_bulk_async_init+0x35/0x110 [mlx5_core]
mlx5e_tls_priv_tx_list_cleanup+0x84/0x3e0 [mlx5_core]
mlx5e_ktls_cleanup_tx+0x38f/0x760 [mlx5_core]
mlx5e_cleanup_nic_tx+0xa7/0x100 [mlx5_core]
mlx5e_detach_netdev+0x1ca/0x2b0 [mlx5_core]
mlx5e_suspend+0xdb/0x140 [mlx5_core]
mlx5e_remove+0x89/0x190 [mlx5_core]
auxiliary_bus_remove+0x52/0x70
device_release_driver_internal+0x40f/0x650
driver_detach+0xc1/0x180
bus_remove_driver+0x125/0x2f0
auxiliary_driver_unregister+0x16/0x50
mlx5e_cleanup+0x26/0x30 [mlx5_core]
cleanup+0xc/0x4e [mlx5_core]
__x64_sys_delete_module+0x2b5/0x450
do_syscall_64+0x3d/0x90
entry_SYSCALL_64_after_hwframe+0x46/0xb0
Freed by task 49502:
kasan_save_stack+0x1e/0x40
kasan_set_track+0x21/0x30
kasan_set_free_info+0x20/0x30
____kasan_slab_free+0x11d/0x1b0
kfree+0x1ba/0x520
mlx5e_tls_priv_tx_list_cleanup+0x2e7/0x3e0 [mlx5_core]
mlx5e_ktls_cleanup_tx+0x38f/0x760 [mlx5_core]
mlx5e_cleanup_nic_tx+0xa7/0x100 [mlx5_core]
mlx5e_detach_netdev+0x1ca/0x2b0 [mlx5_core]
mlx5e_suspend+0xdb/0x140 [mlx5_core]
mlx5e_remove+0x89/0x190 [mlx5_core]
auxiliary_bus_remove+0x52/0x70
device_release_driver_internal+0x40f/0x650
driver_detach+0xc1/0x180
bus_remove_driver+0x125/0x2f0
auxiliary_driver_unregister+0x16/0x50
mlx5e_cleanup+0x26/0x30 [mlx5_core]
cleanup+0xc/0x4e [mlx5_core]
__x64_sys_delete_module+0x2b5/0x450
do_syscall_64+0x3d/0x90
entry_SYSCALL_64_after_hwframe+0x46/0xb0
Fixes: e355477ed9e4 ("net/mlx5: Make mlx5_cmd_exec_cb() a safe API")
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221026135153.154807-8-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
mlx5 SQs must select the timestamp format explicitly according to the
active clock mode, select the current active timestamp mode so ASO SQ create
will succeed.
This fixes the following error prints when trying to create ipsec ASO SQ
while the timestamp format is real time mode.
mlx5_cmd_out_err:778:(pid 34874): CREATE_SQ(0x904) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0xd61c0b), err(-22)
mlx5_aso_create_sq:285:(pid 34874): Failed to open aso wq sq, err=-22
mlx5e_ipsec_init:436:(pid 34874): IPSec initialization failed, -22
Fixes: cdd04f4d4d71 ("net/mlx5: Add support to create SQ and CQ for ASO")
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reported-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221026135153.154807-7-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Currently encap slow path rules just forward to software without
setting the chain id miss register, so driver doesn't restore
the chain, and packets hitting this rule will restart from tc chain
0 instead of continuing to the chain the encap rule was on.
Fix this by setting the chain id miss register to the chain id mapping.
Fixes: 8f1e0b97cc70 ("net/mlx5: E-Switch, Mark miss packets with new chain id mapping")
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221026135153.154807-6-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When tx_port_ts is set, the driver diverts all UPD traffic over PTP port
to a dedicated PTP-SQ. The SKBs are cached until the wire-CQE arrives.
When the packet size is greater then MTU, the firmware might drop it and
the packet won't be transmitted to the wire, hence the wire-CQE won't
reach the driver. In this case the SKBs are accumulated in the SKB fifo.
Add room check to consider the PTP-SQ SKB fifo, when the SKB fifo is
full, driver stops the queue resulting in a TX timeout. Devlink
TX-reporter can recover from it.
Fixes: 1880bc4e4a96 ("net/mlx5e: Add TX port timestamp support")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221026135153.154807-5-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When 2nd flow rules arrives, it will merge together with the
1st one if matcher criteria is the same.
If merge fails, driver will rollback the merge contents, and
reject the 2nd rule. At rollback stage, matcher can't be
disconnected unconditionally, otherise the 1st rule can't be
hit anymore.
Add logic to check if the matcher should be disconnected or not.
Fixes: cc2295cd54e4 ("net/mlx5: DR, Improve steering for empty or RX/TX-only matchers")
Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221026135153.154807-4-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
After firmware reset driver should verify firmware already enabled CRS
and became responsive to pci config cycles before restoring pci state.
Fix that by waiting till device_id is readable through PCI again.
Fixes: eabe8e5e88f5 ("net/mlx5: Handle sync reset now event")
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221026135153.154807-3-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|