Age | Commit message | Author | Files | Lines |
|
The MSCC bug fix in 'net' had to be slightly adjusted because the
register accesses are done slightly differently in net-next.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
On sq closure when we free its descriptors, we should also update netdev
txq on completions which would not arrive. Otherwise if we reopen sqs
and attach them back, for example on fw fatal recovery flow, we may get
tx timeout.
Fixes: 29429f3300a3 ("net/mlx5e: Timeout if SQ doesn't flush during close")
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
After the previous modifications, the offloads are no longer called one
by one: the pi is calculated and the WQE is cleared in between the TLS
and IPSEC offloads, which doesn't quite fit mlx5e_accel_handle_tx's
purpose.
This patch splits mlx5e_accel_handle_tx into two functions that
correspond to two logical phases of running offloads:
1. Before fetching a WQE. Here runs the code that can post WQEs on its
own, before the main WQE is fetched. It's the main part of TLS offload.
2. After fetching a WQE. Here runs the code that updates the WQE's
fields, but can't post other WQEs any more. It's a minor part of TLS
offload that sets the tisn field in the cseg, and eseg-based offloads
(currently IPSEC, and later patches will move GENEVE and checksum
offloads there, too).
This lets mlx5e_xmit take care of all actions needed to transmit
a packet in the right order, improves the structure of the code and
reduces unnecessary operations. The structure will be further improved in
the following patches (all eseg-based offloads will be moved to a single
place, and reserving space for the main WQE will happen between phase 1
and phase 2 of offloads to eliminate unneeded data movements).
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
mlx5e_sq_xmit and mlx5i_sq_xmit always return NETDEV_TX_OK. Drop the
return value to simplify the code.
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
TLS and IPSEC offloads currently return struct sk_buff *, but the value
is either NULL or the same skb that was passed as a parameter. Return
bool instead to provide stronger guarantees to the calling code (it
won't need to support handling a different SKB that could be potentially
returned before this change) and to simplify restructuring this code in
the following commits.
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
In our fast-path design, a WQE (Work Queue Element) must not cross the
page boundary. To enforce that, for WQEs consisting of more than one BB
(Basic Block), the driver checks the available contiguous space in the
WQ in advance, and if it's not enough, it pads it with NOPs.
This patch modifies the code that calculates the position of next WQE,
considering the padding, and prepares the WQE. This code is common for
all SQ types. In this patch it's reorganized in a way that makes the
usage pattern unified for all SQ types, and makes the implementations
self-contained and almost identical, preparing the repeated code for
further deduplication.
One place is left as is: mlx5e_sq_xmit and the mlx5e_fill_sq_frag_edge
call inside it, because it is special in that it may also copy the WQE's cseg
and eseg when reserving space. This will be eliminated in one of the
following patches, and this place will be converted to the new approach,
too.
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
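The padding rule described in this commit can be modelled in a few
lines. This is a Python sketch, not the driver's C code: WQ_SIZE,
BBS_PER_PAGE, contig_bbs and reserve are hypothetical names, the sizes
are made up, and the work queue is reduced to a list of labels. It
assumes a WQE never needs more BBs than fit in one page.

```python
# Toy model of reserving room for a multi-BB WQE in a cyclic work queue.
# All names and sizes are illustrative assumptions, not driver values.
WQ_SIZE = 16        # total basic blocks (BBs) in the queue
BBS_PER_PAGE = 8    # BBs that fit in one page (assumed)


def contig_bbs(pi):
    """BBs available from position pi up to the end of its page."""
    return BBS_PER_PAGE - (pi % BBS_PER_PAGE)


def reserve(wq, pc, num_bbs):
    """Place a WQE of num_bbs BBs; return (pi, new_pc).

    If the WQE would cross a page boundary, the rest of the current
    page is filled with single-BB NOPs and the WQE starts on the next
    page.
    """
    pi = pc % WQ_SIZE
    room = contig_bbs(pi)
    if room < num_bbs:
        for i in range(room):
            wq[(pi + i) % WQ_SIZE] = "NOP"
        pc += room
        pi = pc % WQ_SIZE
    for i in range(num_bbs):
        wq[pi + i] = "WQE"
    return pi, pc + num_bbs


wq = [None] * WQ_SIZE
pi, pc = reserve(wq, pc=6, num_bbs=3)
print(pi, pc, wq[6:11])
```

Starting at producer counter 6 with only two BBs left in the page, the
three-BB WQE is pushed to position 8 and the two skipped BBs become
NOPs.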
|
|
There are multiple functions mlx5{e,i}_*_fetch_wqe that contain the same
repeated code because they operate on different SQ struct types.
mlx5e_sq_fetch_wqe also returns void * instead of the concrete
WQE type.
This commit generalizes the fetch WQE operation by putting this code
into a single function. To simplify calls of the generic function in
concrete use cases, macros are provided that substitute the right WQE
size and cast the return type.
Before this patch, fetch_wqe used to calculate pi itself, but the value
was often known to the caller. This calculation is moved outside to
eliminate this unnecessary step and prepare for the fill_frag_edge
refactoring in the next patch.
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Error CQE was dumped only for TXQ SQs.
Generalise the function, and add usage for error completions
on ICO SQs and XDP SQs.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
We shall always extract channel index out of the txq, regardless
of the relation between txq_ix and num channels. The extraction is
always valid because, if txq_ix is smaller than the number of channels,
txq_ix == priv->txq2sq[txq_ix]->ch_ix.
By doing so, we can remove an if clause from the select queue method,
and have one flow for all packets.
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
For a cyclic work queue, when not requesting a completion per WQE,
a single CQE might indicate the completion of several WQEs.
However, in case some WQE in the batch causes an error, then an error
completion is issued, breaking the batch, and pointing to the offending
WQE in the wqe_counter field.
Hence, WQE-specific error CQE handling (like printing, breaking, etc...)
should be performed only for the last WQE in batch.
Fixes: 130c7b46c93d ("net/mlx5e: TX, Dump WQs wqe descriptors on CQE with error events")
Fixes: fd9b4be8002c ("net/mlx5e: RX, Support multiple outstanding UMR posts")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
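The batching rule from this commit can be illustrated with a small
Python model. It is a sketch under assumptions, not the driver's poll
loop: poll_cqe, wqe_sizes and the argument names are hypothetical, and
a CQE is reduced to a (wqe_counter, is_error) pair.

```python
# Toy model of one CQE completing a batch of WQEs in a cyclic queue.
# Function and argument names are illustrative, not the driver's code.
def poll_cqe(sqcc, wqe_counter, is_error, wqe_sizes, wq_mask):
    """Walk every WQE covered by one CQE.

    sqcc:        consumer counter before processing
    wqe_counter: index of the last WQE the CQE refers to
    wqe_sizes:   dict mapping WQE start index to its number of WQEBBs
    Returns (new_sqcc, list of WQE indices reported as failed).
    """
    failed = []
    while True:
        ci = sqcc & wq_mask
        last_wqe = ci == wqe_counter
        sqcc += wqe_sizes[ci]
        if last_wqe:
            if is_error:
                # Only the WQE named by wqe_counter is the offender;
                # earlier WQEs in the batch completed normally.
                failed.append(ci)
            break
    return sqcc, failed


sizes = {0: 1, 1: 2, 3: 1, 4: 1}
print(poll_cqe(0, 3, True, sizes, wq_mask=7))
```

With WQEs starting at indices 0, 1 and 3 and an error CQE pointing at
index 3, only index 3 is reported, while the consumer counter still
advances over the whole batch.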
|
|
Cited patch changed (channel index, tc) => (TXQ index) mapping to be a
static one, in order to keep indices consistent when changing number of
channels or TCs.
For 32 channels (OOB) and 8 TCs, real num of TXQs is 256.
When reducing the number of channels to 8, the real num of TXQs will be
changed to 64.
This indexing method is buggy:
- For channel #0, TC 3, the TXQ index is 96, which exceeds the real
number of TXQs (64).
- Index 8 is not valid, as there is no such TXQ from the driver's
perspective (it represents channel #8, TC 0, which is not valid with
the above configuration).
As part of the driver's select queue, it calls netdev_pick_tx, which
returns an index in the range of the real number of TXQs. Depending on
the return value, with the examples above, the driver could have
returned an index larger than the real number of TX queues, or crashed
the kernel as it tried to read an invalid address of an SQ which was
not allocated.
Fix that by allocating sequential TXQ indices, and holding a new mapping
between (channel index, tc) => (real TXQ index). This mapping will be
updated as part of priv channels activation, and is used in
mlx5e_select_queue to find the selected queue index.
The existing indices mapping (channel_tc2txq) is no longer needed, as it
is used only for statistics structures and can be calculated at run time.
Delete its definition and updates.
Fixes: 8bfaf07f7806 ("net/mlx5e: Present SW stats when state is not opened")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
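The numbers in this commit message can be reproduced with a short
Python model. The static formula is inferred from the examples above
(index 96 for channel 0, TC 3, with 32 channels); the sequential
formula is an assumption about one possible dense layout, and all
function names are hypothetical.

```python
MAX_CHANNELS = 32   # out-of-box channel count from the example above
NUM_TC = 8


def static_txq(ch, tc):
    """Old scheme: the index depends on the maximum channel count."""
    return ch + tc * MAX_CHANNELS


def sequential_txq(ch, tc, num_channels):
    """Assumed dense scheme: indices fill [0, num_channels * NUM_TC)."""
    return ch + tc * num_channels


num_channels = 8
real_num_txqs = num_channels * NUM_TC

old = static_txq(0, 3)
new = sequential_txq(0, 3, num_channels)
print(real_num_txqs, old, new)

# Every (channel, tc) pair gets a distinct index below real_num_txqs.
dense = {sequential_txq(ch, tc, num_channels)
         for ch in range(num_channels) for tc in range(NUM_TC)}
print(dense == set(range(real_num_txqs)))
```

With 8 channels the static index for channel 0, TC 3 lies outside the
64 real TXQs, while the dense layout keeps every pair inside the range.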
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2019-11-01
Misc updates for mlx5 netdev and core driver
1) Steering Core: Replace CRC32 internal implementation with standard
kernel lib.
2) Steering Core: Support IPv4 and IPv6 mixed matcher.
3) Steering Core: Lockless FTE read lookups
4) TC: Bit sized fields rewrite support.
5) FPGA: Standalone FPGA support.
6) SRIOV: Reset VF parameters configurations on SRIOV disable.
7) netdev: Dump WQs wqe descriptors on CQE with error events.
8) MISC Cleanups.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Dump the Work Queue's TX WQE descriptor when a completion with
error is received.
Example:
[5.331832] mlx5_core 0000:00:04.0 enp0s4: Error cqe on cqn 0xa, ci 0x1, TXQ-SQ qpn 0xe, opcode 0xd, syndrome 0x2, vendor syndrome 0x0
[5.333127] 00000000: 55 65 02 75 31 fe c2 d2 6b 6c 62 1e f9 e1 d8 5c
[5.333837] 00000010: d3 b2 6c b8 89 e4 84 20 0b f4 3c e0 f3 75 41 ca
[5.334568] 00000020: 46 00 00 00 cd 70 a0 92 18 3a 01 de 00 00 00 00
[5.335313] 00000030: 7d bc 05 89 b2 e9 00 02 1e 00 00 0e 00 00 30 d2
[5.335972] WQE DUMP: WQ size 1024 WQ cur size 0, WQE index 0x0, len: 64
[5.336710] 00000000: 00 00 00 1e 00 00 0e 04 00 00 00 08 00 00 00 00
[5.337524] 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 12 33 33
[5.338151] 00000020: 00 00 00 16 52 54 00 00 00 01 86 dd 60 00 00 00
[5.338740] 00000030: 00 00 00 48 00 00 00 00 00 00 00 00 66 ba 58 14
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
The completion queue consumer index increments upon a call to
mlx5_cqwq_pop().
When dumping an error CQE, the index is already incremented.
Decrease one for the print command.
Fixes: 16cc14d81733 ("net/mlx5e: Dump xmit error completions")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
A call to kTLS completion handler was missing in the TXQSQ release
flow. Add it.
Fixes: d2ead1f360e8 ("net/mlx5e: Add kTLS TX HW offload support")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Cited patch removed the assumption only in datapath.
Here we also remove it from the control/cleanup flow.
Fixes: 9ab0233728ca ("net/mlx5e: Tx, Don't implicitly assume SKB-less wqe has one WQEBB")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
If capable, use zero inline mode in TX WQE for non-VLAN packets.
For VLAN ones, keep the enforcement of at least L2 inline mode,
unless the WQE VLAN insertion offload cap is on.
Performance:
Tested single core packet rate of 64Bytes.
NIC: ConnectX-5
CPU: Intel(R) Xeon(R) Gold 6154 CPU @ 3.00GHz
pktgen:
Before: 12.46 Mpps
After: 14.65 Mpps (+17.5%)
XDP_TX:
The MPWQE flow is not affected, as it already has this optimization.
So we test with priv-flag xdp_tx_mpwqe: off.
Before: 9.90 Mpps
After: 10.20 Mpps (+3%)
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Tested-by: Noam Stolero <noams@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
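The per-packet decision described above can be written as a small
decision function. This Python sketch is only an illustration: the
mode constants, their ordering, and tx_inline_mode are assumptions made
for the example, not the driver's definitions.

```python
# Inline modes ordered from "nothing inlined" upwards. The numeric
# values are an assumption made for this sketch.
INLINE_NONE, INLINE_L2, INLINE_IP, INLINE_TCP_UDP = range(4)


def tx_inline_mode(sq_min_inline_mode, has_vlan_tag, wqe_vlan_insert_cap):
    """Pick the inline mode for one packet (hypothetical helper)."""
    mode = sq_min_inline_mode
    if has_vlan_tag and not wqe_vlan_insert_cap:
        # VLAN packets keep at least L2 inlining unless the device can
        # insert the VLAN tag from the WQE itself.
        mode = max(mode, INLINE_L2)
    return mode


plain = tx_inline_mode(INLINE_NONE, False, False)
vlan_no_cap = tx_inline_mode(INLINE_NONE, True, False)
vlan_with_cap = tx_inline_mode(INLINE_NONE, True, True)
print(plain, vlan_no_cap, vlan_with_cap)
```

Only the VLAN packet on a device without the insertion capability is
raised to L2 inlining; the other two cases stay at zero inline.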
|
|
In preparation for unifying the skb_frag and bio_vec, use the fine
accessors which already exist and use skb_frag_t instead of
struct skb_frag_struct.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add support for transmit side kernel-TLS acceleration.
Offload the crypto encryption to HW.
Per TLS connection:
- Use a separate TIS to maintain the HW context.
- Use a separate encryption key.
- Maintain static and progress HW contexts by posting the proper
WQEs at creation time, or upon resync.
- Use a special DUMP opcode to replay the previous frags and sync
the HW context.
To make sure the SQ is able to serve an xmit request, increase
SQ stop room to cover:
- static params WQE,
- progress params WQE, and
- resync DUMP per frag.
Currently supporting TLS 1.2, and key size 128bit.
Tested over SimX simulator.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
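The stop-room accounting listed above amounts to a sum over the extra
WQEs a single xmit may need. The Python sketch below uses placeholder
sizes; the function names, the WQEBB counts and the frag limit are all
assumptions for illustration and are not taken from the driver.

```python
def tls_stop_room(base_bbs, static_params_bbs, progress_params_bbs,
                  dump_bbs, max_frags):
    """WQEBBs that must stay free so one TLS xmit can always be served."""
    return (base_bbs
            + static_params_bbs
            + progress_params_bbs
            + dump_bbs * max_frags)   # one resync DUMP per frag


def sq_has_room(wq_size, pc, cc, stop_room):
    """True if at least stop_room WQEBBs are free between pc and cc."""
    in_flight = pc - cc
    return wq_size - in_flight >= stop_room


# Placeholder numbers, chosen only to make the arithmetic visible.
room = tls_stop_room(base_bbs=4, static_params_bbs=2,
                     progress_params_bbs=1, dump_bbs=1, max_frags=17)
print(room, sq_has_room(1024, pc=1000, cc=0, stop_room=room))
```

The per-frag DUMP term dominates the sum, which is why the larger stop
room is only worth paying for on queues that can carry TLS traffic.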
|
|
Use an SQ field for stop_room, and use the larger value only if TLS
is supported.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When polling a CQE of an SKB-less WQE, don't assume it consumed only
one WQEBB. Use wi->num_wqebbs directly instead.
In the downstream patch, SKB-less WQEs might have more than one WQEBB,
thus this change is needed.
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Change mlx5e_sq_fetch_wqe to be agnostic to the Work Queue
Element (WQE) type.
Before this patch, it was specific for struct mlx5e_tx_wqe.
In order to allow the change, the function now returns the
generic void pointer, and gets the WQE size to do the zero
memset.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When ctrl->tisn field exists, this indicates an operation (HW offload)
on the TCP payload.
For such WQEs, inline the headers up to L4.
This is in preparation for kTLS HW offload support, added in
a downstream patch.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Take datapath helper functions to a new header file en/txrx.h.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Honestly all the conflicts were simple overlapping changes,
nothing really interesting to report.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The TX queue index returned by the fallback function ranges
between [0,NUM CHANNELS - 1] if QoS isn't set and
[0, (NUM CHANNELS)*(NUM TCs) -1] otherwise.
Our HW uses different TC mapping than the fallback function
(which is denoted as 'up', user priority) so we only need to extract
a channel number out of the returned value.
Since (NUM CHANNELS)*(NUM TCs) is a relatively small number, using
reciprocal scale almost always returns zero.
We instead access the 'txq2sq' table to extract the sq (and with it the
channel number) associated with the tx queue, thus getting
a more evenly distributed channel number.
Perf:
Rx/Tx side with Intel(R) Xeon(R) Silver 4108 CPU @ 1.80GHz and ConnectX-5.
Used 'iperf' UDP traffic, 10 threads, and priority 5.
Before: 0.566Mpps
After: 2.37Mpps
As expected, releasing the existing bottleneck of steering all traffic
to TX queue zero significantly improves transmission rates.
Fixes: 7ccdd0841b30 ("net/mlx5e: Fix select queue callback")
Signed-off-by: Shay Agroskin <shayag@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
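The effect of reciprocal scaling on small inputs is easy to check
numerically. The Python sketch below reproduces the arithmetic of the
kernel's reciprocal_scale() helper (a 32x32-bit multiply followed by a
shift by 32); the txq2ch table is a hypothetical stand-in for the
driver's txq2sq lookup, with an assumed layout.

```python
def reciprocal_scale(val, ep_ro):
    """Map a 32-bit val into [0, ep_ro): (val * ep_ro) >> 32."""
    return ((val & 0xFFFFFFFF) * (ep_ro & 0xFFFFFFFF)) >> 32


NUM_CHANNELS = 32
NUM_TC = 8
NUM_TXQS = NUM_CHANNELS * NUM_TC

# Scaling a small TXQ index (instead of a full 32-bit hash) collapses
# every queue onto channel 0.
scaled = {reciprocal_scale(txq, NUM_CHANNELS) for txq in range(NUM_TXQS)}

# Assumed table layout: each TXQ remembers the channel of its SQ.
txq2ch = [txq % NUM_CHANNELS for txq in range(NUM_TXQS)]
looked_up = {txq2ch[txq] for txq in range(NUM_TXQS)}

print(scaled, len(looked_up))
```

reciprocal_scale() is designed for inputs spread over the full 32-bit
range (for example, 0x80000000 scales to 16 with 32 channels), so
feeding it values below 256 yields 0 in this configuration, whereas the
table lookup reaches all 32 channels.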
|
|
__netdev_tx_sent_queue() was introduced by:
commit 3e59020abf0f ("net: bql: add __netdev_tx_sent_queue()")
BQL counters should be updated without flipping/caring about
BQL status, if the current skb has xmit_more set.
Using __netdev_tx_sent_queue() avoids messing with BQL stop
flag, increases performance on GSO workload by keeping
doorbells to the minimum required and also sparing atomic
operations.
Signed-off-by: Erez Alfasi <ereza@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Cited patch refactored the xmit_more indication while not preserving
its functionality. Fix it.
Fixes: 3c31ff22b25f ("drivers: mellanox: use netdev_xmit_more() helper")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Update all users of eth_get_headlen to pass network device, fetch
network namespace from it and pass it down to the flow dissector.
This commit is a noop until administrator inserts BPF flow dissector
program.
Cc: Maxim Krasnyansky <maxk@qti.qualcomm.com>
Cc: Saeed Mahameed <saeedm@mellanox.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: intel-wired-lan@lists.osuosl.org
Cc: Yisen Zhuang <yisen.zhuang@huawei.com>
Cc: Salil Mehta <salil.mehta@huawei.com>
Cc: Michael Chan <michael.chan@broadcom.com>
Cc: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
skb->xmit_more hint is now always 0. This switches the mellanox drivers
to the netdev_xmit_more() helper.
Cc: Saeed Mahameed <saeedm@mellanox.com>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Boris Pismenny <borisp@mellanox.com>
Cc: Ilya Lesokhin <ilyal@mellanox.com>
Cc: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently only the default Geneve UDP port (6081) is supported.
For the tx side, the HW is assisted by SW parsing, which sets the
headers offset to offload tunneled LSO and csum. Note that for udp
tunnels, we don't use special rx offloads, as rss on the outer headers
is enough, we support checksum complete and GRO takes care of
aggregation.
Geneve TSO BW and CPU load results (tested using iperf single tcp
stream).
In this patch we add TSO support over Geneve, so the "before" result
doesn't actually get to using the TSO HW offload even when turned on.
Tested on ConnectX-5, Intel(R) Xeon(R) CPU E5-2660 v2 @2.20GHz.
 __________________________________
| Before         | After           |
|________________|_________________|
| 12.6 Gbits/sec | 21.7 Gbits/sec  |
| 100% CPU load  | 61.5% CPU load  |
|________________|_________________|
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Acked-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
After the previous patch, all the callers of ndo_select_queue()
provide as a 'fallback' argument netdev_pick_tx.
The only exceptions are nested calls to ndo_select_queue(),
which pass down the 'fallback' available in the current scope
- still netdev_pick_tx.
We can drop such argument and replace fallback() invocation with
netdev_pick_tx(). This avoids an indirect call per xmit packet
in some scenarios (TCP syn, UDP unconnected, XDP generic, pktgen)
with device drivers implementing such ndo. It also cleans up the code
a bit.
Tested with ixgbe and CONFIG_FCOE=m
With pktgen using queue xmit:
threads    vanilla    patched
           (kpps)     (kpps)
1          2334       2428
2          4166       4278
4          7895       8100
v1 -> v2:
- rebased after helper's name change
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2019-02-21
This series adds some misc updates to mlx5 driver,
1) Eli Britstein, Introduces tunnel entropy control from PCMR register
and fixes GRE key by controlling port tunnel entropy calculation.
2) Eran Ben Elisha, provides some mlx5 fixes to the latest tx devlink health
reporting mechanism.
3) Huy Nguyen, Added the support for ndo bridge_setlink to allow
VEPA/VEB E-Switch legacy mode configurations.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When TX reporter was introduced, it took ownership over TX timeout error
handling. This introduced a regression in case TX reporter is not valid
(NET_DEVLINK is not set, or devlink_health_reporter_create failure).
Fix mlx5e_tx_reporter_timeout function so it can be called at all times.
In addition, remove a warning print that indicates that a TX timeout won't
be handled in case of no valid TX reporter.
Fixes: 7d91126b1aea ("net/mlx5e: Add tx timeout support for mlx5e tx reporter")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
After AF_PACKET is fixed to calculate the transport header offset
correctly, trust the value set by the kernel. If the offset wasn't set,
it means there is no transport header in the packet.
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
skb_transport_offset() == 0 is not a special value. The only special
value is when skb->transport_header is ~0U, and it's checked by
skb_transport_header_was_set().
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
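The distinction between "offset is zero" and "header was never set"
can be shown with a minimal model. This Python class is a sketch, not
the kernel's struct sk_buff: the 16-bit field width and all names are
assumptions for illustration.

```python
UNSET = 0xFFFF  # model of ~0U truncated to the field width (assumed 16 bit)


class Skb:
    """Tiny stand-in for an skb that tracks only header offsets."""

    def __init__(self, data=0):
        self.data = data                 # packet data offset from buffer head
        self.transport_header = UNSET    # offset from buffer head, or UNSET

    def set_transport_header(self, offset):
        """Record the transport header at `offset` bytes past data."""
        self.transport_header = self.data + offset

    def transport_header_was_set(self):
        return self.transport_header != UNSET

    def transport_offset(self):
        return self.transport_header - self.data


fresh = Skb(data=64)
at_zero = Skb(data=64)
at_zero.set_transport_header(0)
print(fresh.transport_header_was_set(),
      at_zero.transport_header_was_set(),
      at_zero.transport_offset())
```

An skb whose transport header sits at offset 0 is still reported as
set; only the sentinel value marks the header as missing.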
|
|
An ipvlan bug fix in 'net' conflicted with the abstraction away
of the IPV6 specific support in 'net-next'.
Similarly, a bug fix for mlx5 in 'net' conflicted with the flow
action conversion in 'net-next'.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add mlx5e tx reporter to devlink health reporters. This reporter will be
responsible for diagnosing, reporting and recovering of tx errors.
This patch declares the TX reporter operations and creates it using the
devlink health API. Currently, this reporter supports reporting and
recovering from send error CQE only. In addition, it adds diagnose
information for the open SQs.
For a local SQ recover (due to driver error report), in case of SQ recover
failure, the recover operation will be considered as a failure.
For a full tx recover, an attempt to close and open the channels will be
done. If this one passed successfully, it will be considered as a
successful recover.
The SQ recover from error CQE flow is not a new feature in the driver;
this patch reorganizes the functions and adapts them for the devlink
health API. For this purpose, move code from en_main.c to a new file
named reporter_tx.c.
Diagnose output:
$devlink health diagnose pci/0000:00:09.0 reporter tx -j -p
{
"SQs": [ {
"sqn": 138,
"HW state": 1,
"stopped": false
},{
"sqn": 142,
"HW state": 1,
"stopped": false
} ]
}
$devlink health diagnose pci/0000:00:09.0 reporter tx
SQs:
sqn: 138 HW state: 1 stopped: false
sqn: 142 HW state: 1 stopped: false
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In the Innova IPsec TX offload data path, special software parser
metadata is used to pass some packet attributes to the hardware. This
metadata is passed using the Ethernet control segment of a WQE (a HW
descriptor) header.
The cited commit might nullify this header, so the metadata is lost.
This caused a significant performance drop during HW offloading
operation.
Fix by restoring the metadata at the Ethernet control segment in case
it was nullified.
Fixes: 37fdffb217a4 ("net/mlx5: WQ, fixes for fragmented WQ buffers API")
Signed-off-by: Raed Salem <raeds@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
This reverts the devlink health changes from 9/17/2019,
Jiri wants things to be designed differently and it was
agreed that the easiest way to do this is start from the
beginning again.
Commits reverted:
cb5ccfbe73b389470e1dc11061bb185ef4bc9aec
880ee82f0313453ec5a6cb122866ac057263066b
c7af343b4e33578b7de91786a3f639c8cfa0d97b
ff253fedab961b22117a73ab808fcfa9e6852b50
6f9d56132eb6d2603d4273cfc65bed914ec47acb
fcd852c69d776c0f46c8f79e8e431e5cc6ddc7b7
8a66704a13d9713593342e29b4f0c19762f5746b
12bd0dcefe88782ac1c9fff632958dd1b71d27e5
aba25279c10094c5c97d09c3491ca86d00b4ad5e
ce019faa70f81555fa17ebc1d5a03651f2e7e15a
b8c45a033acc607201588f7665ba84207e5149e0
And the follow-on build fix:
33a0efa4baecd689da9474ce0e8b673eb6931c60
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add mlx5e tx reporter to devlink health reporters. This reporter will be
responsible for diagnosing, reporting and recovering of TX errors.
This patch declares the TX reporter operations and allocates it using the
devlink health API. Currently, this reporter supports reporting and
recovering from send error CQE only. In addition, it adds diagnose
information for the open SQs.
For a local SQ recover (due to driver error report), in case of SQ recover
failure, the recover operation will be considered as a failure.
For a full TX recover, an attempt to close and open the channels will be
done. If this one passed successfully, it will be considered as a
successful recover.
The SQ recover from error CQE flow is not a new feature in the driver;
this patch reorganizes the functions and adapts them for the devlink
health API. For this purpose, move code from en_main.c to a new file
named reporter_tx.c.
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The opcode indicates the error reason.
Printing it helps with debugging.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
mlx5-next shared branch with rdma subtree to avoid mlx5 rdma v.s. netdev
conflicts.
Highlights:
1) RDMA ODP (On Demand Paging) improvements and moving ODP logic to
mlx5 RDMA driver
2) Improved mlx5 core driver and device events handling and provided API
for upper layers to subscribe to device events.
3) RDMA only code cleanup from mlx5 core
4) Add helper to get CQE opcode
5) Rework handling of port module events
6) shared mlx5_ifc.h updates to avoid conflicts
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Introduce and use a helper that extracts the opcode
from a CQE (completion queue entry) structure.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
mlx5e netdevice used to calculate fragment edges by a call to
mlx5_wq_cyc_get_frag_size(). This calculation did not give the correct
indication for queues smaller than PAGE_SIZE (broken by default on
PowerPC, where PAGE_SIZE == 64KB). Here it is replaced by the correct new
calls/API.
Since (TX/RX) Work Queues buffers are fragmented, here we introduce
changes to the API in core driver, so that it gets a stride index and
returns the index of last stride on same fragment, and an additional
wrapping function that returns the number of physically contiguous
strides that can be written contiguously to the work queue.
This obsoletes the following API functions, and their buggy
usage in EN driver:
* mlx5_wq_cyc_get_frag_size()
* mlx5_wq_cyc_ctr2fragix()
The new API improves modularity and hides the details of such
calculation for mlx5e netdevice and mlx5_ib rdma drivers.
New calculation is also more efficient, and improves performance
as follows:
Packet rate test: pktgen, UDP / IPv4, 64byte, single ring, 8K ring size.
Before: 16,477,619 pps
After: 17,085,793 pps
3.7% improvement
Fixes: 3a2f70331226 ("net/mlx5: Use order-0 allocations for all WQ types")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
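The two API pieces described above (last stride on the same fragment,
and the count of contiguous strides) can be sketched as follows. This
is a simplified Python model under assumptions: it ignores any stride
offset within the first fragment, and the function and parameter names
are hypothetical rather than the core driver's.

```python
def last_contig_stride(ix, frag_sz_m1, sz_m1):
    """Index of the last stride in the same fragment as stride ix.

    frag_sz_m1: strides per fragment minus one (power of two minus one)
    sz_m1:      total strides in the queue minus one
    The min() keeps the result inside queues smaller than a fragment.
    """
    return min(ix | frag_sz_m1, sz_m1)


def contig_strides(ix, frag_sz_m1, sz_m1):
    """Number of physically contiguous strides starting at ix."""
    return last_contig_stride(ix, frag_sz_m1, sz_m1) - ix + 1


# 64 strides per fragment, 256-stride queue.
big = contig_strides(60, frag_sz_m1=63, sz_m1=255)
# Queue smaller than one fragment: 16 strides in total.
small = contig_strides(10, frag_sz_m1=63, sz_m1=15)
print(big, small)
```

The second case is the one the old calculation got wrong: the answer
is bounded by the queue size, not by the fragment size.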
|
|
Use designated function mlx5e_dma_get() to get
the mlx5e_sq_dma object to be pushed into fifo.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
For most of these calls we can just pass NULL through to the fallback
function as the sb_dev. The only cases where we cannot are the cases where
we might be dealing with either an upper device or a driver that would
have configured things to support an sb_dev itself.
The only driver that has any significant change in this patch set should be
ixgbe as we can drop the redundant functionality that existed in both the
ndo_select_queue function and the fallback function that was passed through
to us.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
This patch makes it so that instead of passing a void pointer as the
accel_priv we instead pass a net_device pointer as sb_dev. Making this
change allows us to pass the subordinate device through to the fallback
function eventually so that we can keep the actual code in the
ndo_select_queue call as focused as possible on the exception cases.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
Add per-ring and global ethtool counters for TX completions.
This helps us monitor and analyze TX flow performance.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|