Age | Commit message (Collapse) | Author | Files | Lines |
|
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
Saeed Mahameed says:
====================
mlx5-shared-2017-08-07
This series includes some mlx5 updates for both net-next and rdma trees.
From Saeed,
Core driver updates to allow selectively building the driver with
or without some large driver components, such as
- E-Switch (Ethernet SRIOV support).
- Multi-Physical Function Switch (MPFs) support.
For that we split E-Switch and MPFs functionalities into separate files.
From Erez,
Delay mlx5_core events when mlx5 interfaces, namely mlx5_ib, registration
is taking place and until it completes.
From Rabie,
Increase the maximum supported flow counters.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Get rid of struct tc_to_netdev which is now just unnecessary container
and rather pass per-type structures down to drivers directly.
Along with that, consolidate the naming of per-type structure variables
in cls_*.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
As ndo_setup_tc is generic offload op for whole tc subsystem, does not
really make sense to have cls-specific args. So move them under
cls_common structurure which is embedded in all cls structs.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Allow to selectively build the driver with or without sriov eswitch, VF
representors and TC offloads.
Also remove the need of two ndo ops structures (sriov & basic)
and keep only one unified ndo ops, compile out VF SRIOV ndos when not
needed (MLX5_ESWITCH=n), and for VF netdev calling those ndos will result
in returning -EPERM.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Cc: Jes Sorensen <jsorensen@fb.com>
Cc: kernel-team@fb.com
|
|
When IP tunnel encapsulation rules are offloaded, the kernel can't see
the traffic of the offloaded flow. The neighbour for the IP tunnel
destination of the offloaded flow can mistakenly become STALE and
deleted by the kernel since its 'used' value wasn't changed.
To make sure that a neighbour which is used by the HW won't become
STALE, we proactively update the neighbour 'used' value every
DELAY_PROBE_TIME period, when packets were matched and counted by the HW
for one of the tunnel encap flows related to this neighbour.
The periodic task that updates the used neighbours is scheduled when a
tunnel encap rule is successfully offloaded into HW and keeps re-scheduling
itself as long as the representor's neighbours list isn't empty.
Add, remove, lookup and status change operations done over the
representor's neighbours list or the neighbour hash entry encaps list
are all serialized by RTNL lock.
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
In order to offload TC encap rules, the driver does a lookup for the IP
tunnel neighbour according to the output device and the destination IP
given by the user.
To keep tracking after the validity state of such neighbours, we keep
the neighbours information (pair of device pointer and destination IP)
in a hash table maintained at the relevant egress representor and
register to get NETEVENT_NEIGH_UPDATE events. When getting neighbour update
netevent, we search for a match among the cached neighbours entries used for
encapsulation.
In case the neighbour isn't valid, we can't offload the flow into the
HW. We cache the flow (requested matching and actions) in the driver and
offload the rule later, when the neighbour is resolved and becomes
valid.
When a flow is only cached in the driver and not offloaded into HW
yet, we use EAGAIN return value to mark it internally, the TC ndo still
returns success.
Listen to kernel neighbour update netevents to trace relevant neighbours
validity state:
1. If a neighbour becomes valid, offload the related rules to HW.
2. If the neighbour becomes invalid, remove the related rules from HW.
3. If the neighbour mac address was changed, update the encap header.
Remove all the offloaded rules using the old encap header from the HW
and insert new rules to HW with updated encap header.
Access to the neighbors hash table is protected by RTNL lock of its
caller or by the table's spinlock.
Details of the locking/synchronization among the different actions
applied on the neighbour table:
Add/remove operations - protected by RTNL lock of its caller (all TC
commands are protected by RTNL lock). Add and remove operations are
initiated only when the user inserts/removes a TC rule into/from the driver.
Lookup/remove operations - since the lookup operation is done from
netevent notifier block, RTNL lock can't be used (atomic context).
Use the table's spin lock to protect lookups from TC user removal operation.
bh is used since netevent can be called from a softirq context.
Lookup/add operations - The hash table access functions are taking
care of the protection between lookup and add operations.
When adding/removing encap headers and rules to/from the HW, RTNL lock
is used. It can happen when:
1. The user inserts/removes a TC rule into/from the driver (TC commands
are protected by RTNL lock of it's caller).
2. The driver gets neighbour notification event, which reports about
neighbour validity status change. Before adding/removing encap headers
and rules to/from the HW, RTNL lock is taken.
A neighbour hash table entry should be freed when its encap list is empty.
Since The neighbour update netevent notification schedules a neighbour
update work that uses the neighbour hash entry, it can't be freed
unconditionally when the encap list becomes empty during TC delete rule flow.
Use reference count to protect from freeing neighbour hash table entry
while it's still in use.
When the user asks to unregister a netdvice used by one of the neigbours,
neighbour removal notification is received. Then we take a reference on the
neighbour and don't free it until the relevant encap entries (and flows) are
marked as invalid (not offloaded) and removed from HW.
As long as the encap entry is still valid (checked under RTNL lock) we
can safely access the neighbour device saved on mlx5e_neigh struct.
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Introduce support in updating statistics of offloaded TC flower
classifiers. Currently only the DROP action is supported.
Signed-off-by: Amir Vadai <amirva@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Slightly refactor and re-order the flow steering structs,
tables and data-bases for better self-containment and
flexibility to add more future steering phases
(tables/rules/data bases) e.g: aRFS.
Changes:
1. Move the vlan DB and address DB into their table structs.
2. Rename steering table structs to unique format: mlx5e_*_table,
e.g: mlx5e_vlan_table.
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Introduce offloading of skbedit mark action.
For example, to mark with 0x1234, all TCP (ip_proto 6) packets arriving
to interface ens9:
# tc qdisc add dev ens9 ingress
# tc filter add dev ens9 protocol ip parent ffff: \
flower ip_proto 6 \
indev ens9 \
action skbedit mark 0x1234
Signed-off-by: Amir Vadai <amir@vadai.me>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Parse tc_cls_flower_offload into device specific commands and program
the hardware to classify and act accordingly.
For example, to drop ICMP (ip_proto 1) packets from specific smac, dmac,
src_ip, src_ip, arriving to interface ens9:
# tc qdisc add dev ens9 ingress
# tc filter add dev ens9 protocol ip parent ffff: \
flower ip_proto 1 \
dst_mac 7c:fe:90:69:81:62 src_mac 7c:fe:90:69:81:56 \
dst_ip 11.11.11.11 src_ip 11.11.11.12 indev ens9 \
action drop
Signed-off-by: Amir Vadai <amir@vadai.me>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Extend ndo_setup_tc() to support ingress tc offloading. Will be used by
later patches to offload tc flower filter.
Feature is off by default and could be enabled by issuing:
# ethtool -K eth0 hw-tc-offload on
Offloads flow table is dynamically created when first filter is
added.
Rules are saved in a hash table that is maintained by the consumer (for
example - the flower offload in the next patch).
When last filter is removed and no filters exist in the hash table, the
offload flow table is destroyed.
Signed-off-by: Amir Vadai <amir@vadai.me>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|