summaryrefslogtreecommitdiffstats
path: root/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
AgeCommit message (Collapse)AuthorFilesLines
2020-05-27net/mlx5: E-Switch, Alloc and free unique metadata for matchVu Pham1-32/+64
Introduce infrastructure to create unique metadata for match for vport without depending on vport_num. Vport uses its default metadata for match in standalone configuration but will share a different unique "bond_metadata" for match with other vports in bond configuration. Using ida to generate unique metadata for match for vports in default and bond configurations. Introduce APIs to generate, free metadata for match. Introduce APIs to set vport's bond_metadata and replace its ingress acl rules with bond_metatada. Signed-off-by: Vu Pham <vuhuong@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-27net/mlx5: E-Switch, Refactor eswitch ingress acl codesVu Pham1-260/+9
Restructure the eswitch ingress acl codes into eswitch directory and different files: . Acl ingress helper functions to acl_helper.c/h . Acl ingress functions used in offloads mode to acl_ingress_ofld.c . Acl ingress functions used in legacy mode to acl_ingress_lgy.c This patch does not change any functionality. Signed-off-by: Vu Pham <vuhuong@mellanox.com>
2020-05-27net/mlx5: E-Switch, Refactor eswitch egress acl codesVu Pham1-33/+3
Refactor the egress acl codes so that offloads and legacy modes can configure specifically their own needs of egress acl table, groups and rules. While at it, restructure the eswitch egress acl codes into eswitch directory and different files: . Acl egress helper functions to acl_helper.c/h . Acl egress functions used in offloads mode to acl_egress_ofld.c . Acl egress functions used in legacy mode to acl_egress_lgy.c This patch does not change any functionality. Signed-off-by: Vu Pham <vuhuong@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22net/mlx5e: Add support for hw decapsulation of MPLS over UDPEli Cohen1-0/+4
MPLS over UDP is supported in hardware by using a packet reformat object with reformat type equal L3_TUNNEL_TO_L2 which both decapsulates the outer L3, L4 and MPLS headers, and allows for setting the L2 headers of the resulting decapsulated packet. For the hardware to operate correctly, the configuration of the firmware must have FLEX_PARSER_PROFILE_ENABLE = 1. Example tc rule: tc filter add dev bareudp0 protocol all prio 1 root flower enc_dst_port \ 6635 enc_src_ip 8.8.8.23 action mpls pop protocol ip pipe \ action pedit ex munge eth dst set 00:11:22:33:44:21 pipe action \ mirred egress redirect dev enp59s0f0_0 We use pedit to set the correct destination MAC. For MPLS over UDP decapsulation to take place, the driver logic requires the following: 1. flower filter added on bareudp device. 2. action mpls pop 3. zero or more pedit munge actions 4. one redirect action Current implementation supports only IPv4 and no VLAN. tc filter show output looks like this: filter protocol all pref 1 flower chain 0 filter protocol all pref 1 flower chain 0 handle 0x1 enc_src_ip 8.8.8.24 enc_dst_port 6635 in_hw in_hw_count 1 action order 1: mpls pop protocol ip pipe index 2 ref 1 bind 1 action order 2: pedit action pipe keys 2 index 1 ref 1 bind 1 key #0 at eth+0: val 00112233 mask 00000000 key #1 at eth+4: val 44210000 mask 0000ffff action order 3: mirred (Egress Redirect to device enp59s0f0_0) stolen index 2 ref 1 bind 1 Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Vlad Buslov <vladbu@mellanox.com> Reviewed-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22net/mlx5e: Introduce kconfig var for TC supportVlad Buslov1-0/+2
In order to improve code maintainability and readability, introduce new CONFIG_MLX5_CLS_ACT kconfig variable to control compilation of TC hardware offloads implementation. This allows distinguishing between features that require TC support (MPLSoUDP, etc.) and features that just rely on representor functionality (rep_bond for live migration, etc.). Modify rep_tc.h, rep_neigh.h, en_tc.h and chains.h files to provide stubs for functions that are called from generic code. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-06Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller1-9/+9
Conflicts were all overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-30Merge branch 'mlx5-next' of ↵Saeed Mahameed1-11/+11
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux mlx5 updates for both net-next and rdma-next: 1) HW bits and definitions for TLS and IPsec offlaods 2) Release all pages capability bits 3) New command interface helpers and some code cleanup as a result 4) Move qp.c out of mlx5 core driver into mlx5_ib rdma driver Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-30net/mlx5: E-switch, Fix mutex init orderParav Pandit1-6/+6
In cited patch mutex is initialized after its used. Below call trace is observed. Fix the order to initialize the mutex early enough. Similarly follow mirror sequence during cleanup. kernel: DEBUG_LOCKS_WARN_ON(lock->magic != lock) kernel: WARNING: CPU: 5 PID: 45916 at kernel/locking/mutex.c:938 __mutex_lock+0x7d6/0x8a0 kernel: Call Trace: kernel: ? esw_vport_tbl_get+0x3b/0x250 [mlx5_core] kernel: ? mark_held_locks+0x55/0x70 kernel: ? __slab_free+0x274/0x400 kernel: ? lockdep_hardirqs_on+0x140/0x1d0 kernel: esw_vport_tbl_get+0x3b/0x250 [mlx5_core] kernel: ? mlx5_esw_chains_create_fdb_prio+0xa57/0xc20 [mlx5_core] kernel: mlx5_esw_vport_tbl_get+0x88/0xf0 [mlx5_core] kernel: mlx5_esw_chains_create+0x2f3/0x3e0 [mlx5_core] kernel: esw_create_offloads_fdb_tables+0x11d/0x580 [mlx5_core] kernel: esw_offloads_enable+0x26d/0x540 [mlx5_core] kernel: mlx5_eswitch_enable_locked+0x155/0x860 [mlx5_core] kernel: mlx5_devlink_eswitch_mode_set+0x1af/0x320 [mlx5_core] kernel: devlink_nl_cmd_eswitch_set_doit+0x41/0xb0 Fixes: 96e326878fa5 ("net/mlx5e: Eswitch, Use per vport tables for mirroring") Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-30net/mlx5: E-switch, Fix printing wrong error valueParav Pandit1-1/+1
When mlx5_modify_header_alloc() fails, instead of printing the error value returned, current error log prints 0. Fix by printing correct error value returned by mlx5_modify_header_alloc(). Fixes: 6724e66b90ee ("net/mlx5: E-Switch, Get reg_c1 value on miss") Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-30net/mlx5: E-switch, Fix error unwinding flow for steering init failureParav Pandit1-2/+2
Error unwinding is done incorrectly in the cited commit. When steering init fails, there is no need to perform steering cleanup. When vport error exists, error cleanup should be mirror of the setup routine, i.e. to perform steering cleanup before metadata cleanup. This avoids the call trace in accessing uninitialized objects which are skipped during steering_init() due to failure in steering_init(). Call trace: mlx5_cmd_modify_header_alloc:805:(pid 21128): too many modify header actions 1, max supported 0 E-Switch: Failed to create restore mod header BUG: kernel NULL pointer dereference, address: 00000000000000d0 [ 677.263079] mlx5_destroy_flow_group+0x13/0x80 [mlx5_core] [ 677.268921] esw_offloads_steering_cleanup+0x51/0xf0 [mlx5_core] [ 677.275281] esw_offloads_enable+0x1a5/0x800 [mlx5_core] [ 677.280949] mlx5_eswitch_enable_locked+0x155/0x860 [mlx5_core] [ 677.287227] mlx5_devlink_eswitch_mode_set+0x1af/0x320 [ 677.293741] devlink_nl_cmd_eswitch_set_doit+0x41/0xb0 [ 677.299217] genl_rcv_msg+0x1eb/0x430 Fixes: 7983a675ba65 ("net/mlx5: E-Switch, Enable chains only if regs loopback is enabled") Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-28net/mlx5: Add support for COPY steering actionHuy Nguyen1-2/+2
Add COPY type to modify_header action. IPsec feature is the first feature that needs COPY steering action. Signed-off-by: Huy Nguyen <huyn@mellanox.com> Signed-off-by: Raed Salem <raeds@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Acked-by: Leon Romanovsky <leonro@mellanox.com>
2020-04-23net/mlx5: Update eswitch to new cmd interfaceLeon Romanovsky1-9/+9
Do mass update of eswitch to reuse newly introduced mlx5_cmd_exec_in*() interfaces. Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-04-08net/mlx5: Fix condition for termination table cleanupEli Cohen1-9/+3
When we destroy rules from slow path we need to avoid destroying termination tables since termination tables are never created in slow path. By doing so we avoid destroying the termination table created for the slow path. Fixes: d8a2034f152a ("net/mlx5: Don't use termination tables in slow path") Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-29net/mlx5: E-Switch: Move eswitch chains to a new directorySaeed Mahameed1-1/+1
eswitch_offloads_chains.{c,h} were just introduced this kernel release cycle, eswitch is in high development demand right now and many features are planned to be added to it. eswitch deserves its own directory and here we move these new files to there, in preparation for upcoming eswitch features and new files. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com>
2020-03-25net/mlx5: E-switch, Protect eswitch mode changesParav Pandit1-37/+68
Currently eswitch mode change is occurring from 2 different execution contexts as below. 1. sriov sysfs enable/disable 2. devlink eswitch set commands Both of them need to access eswitch related data structures in synchronized manner. Without any synchronization below race condition exist. SR-IOV enable/disable with devlink eswitch mode change: cpu-0 cpu-1 ----- ----- mlx5_device_disable_sriov() mlx5_devlink_eswitch_mode_set() mlx5_eswitch_disable() esw_offloads_stop() esw_offloads_disable() mlx5_eswitch_disable() esw_offloads_disable() Hence, they are synchronized using a new mode_lock. eswitch's state_lock is not used as it can lead to a deadlock scenario below and state_lock is only for vport and fdb exclusive access. ip link set vf <param> netlink rcv_msg() - Lock A rtnl_lock vfinfo() esw->state_lock() - Lock B devlink eswitch_set devlink_mutex esw->state_lock() - Lock B attach_netdev() register_netdev() rtnl_lock - Lock A Alternatives considered: 1. Acquiring rtnl lock before taking esw->state_lock to follow similar locking sequence as ip link flow during eswitch mode set. rtnl lock is not good idea for two reasons. (a) Holding rtnl lock for several hundred device commands is not good idea. (b) It leads to below and more similar deadlocks. devlink eswitch_set devlink_mutex rtnl_lock - Lock A esw->state_lock() - Lock B eswitch_disable() reload() ib_register_device() ib_cache_setup_one() rtnl_lock() 2. Exporting devlink lock may lead to undesired use of it in vendor driver(s) in future. 3. Unloading representors outside of the mode_lock requires serialization with other process trying to enable the eswitch. 4. Differing the representors life cycle to a different workqueue requires synchronization with func_change_handler workqueue. Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-25net/mlx5: E-switch, Extend eswitch enable to handle num_vfs changeParav Pandit1-5/+8
Subsequent patch protects eswitch mode changes across sriov and devlink interfaces. It is desirable for eswitch to provide thread safe eswitch enable and disable APIs. Hence, extend eswitch enable API to optionally update num_vfs when requested. In subsequent patch, eswitch num_vfs are updated after all the eswitch users eswitch drops its reference count. Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-25net/mlx5: Split eswitch mode check to different helper functionParav Pandit1-4/+33
In order to check eswitch state under a lock, prepare code to split capability check and eswitch state check into two helper functions. Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-25net/mlx5: E-Switch, free flow_group_in after creating the restore tableRoi Dayan1-0/+2
We allocate a temporary memory but forget to free it. Fixes: 11b717d61526 ("net/mlx5: E-Switch, Get reg_c0 value on CQE") Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-25net/mlx5: E-Switch, Enable chains only if regs loopback is enabledPaul Blakey1-5/+6
Register c0 loopback is needed to fully support chains and prios. Enable chains and prio only if loopback (of reg c1 which came together with c0), is enabled. To be able to check that, move enabling of loopback before eswitch chains init. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-25net/mlx5: E-Switch, Enable restore table only if reg_c1 is supportedPaul Blakey1-0/+9
Reg c0/c1 matching, rewrite of regs c0/c1, and copy header of regs c1,B is needed for the restore table to function, might not be supported by firmware, and creation of the restore table or the copy header will fail. Check reg_c1 loopback support, as firmware which supports this, should have all of the above. Fixes: 11b717d61526 ("net/mlx5: E-Switch, Get reg_c0 value on CQE") Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-17net/mlx5: Don't use termination tables in slow pathEli Cohen1-6/+14
Don't use termination tables for packets that are steered to the slow path, as a pre-step for supporting packet encap (packet reformat) action on termination tables. Packet encap (reformat action) actions steer the packet to the slow path until outer arp entries are resolved. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-13net/mlx5: Avoid deriving mlx5_core_dev second timeParav Pandit1-9/+7
All callers needs to work on mlx5_core_dev and it is already derived before calling mlx5_devlink_eswitch_check(). Hence, accept mlx5_core_dev in mlx5_devlink_eswitch_check(). Given that it works on mlx5_core_dev change helper function name to drop devlink prefix. Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-13net/mlx5: E-switch, Annotate termtbl_mutex mutex destroyParav Pandit1-1/+3
Annotate mutex destroy to keep it symmetric to init sequence. It should be destroyed after its users (representor netdevices) are destroyed in below flow. esw_offloads_disable() esw_offloads_unload_rep() Hence, initialize the mutex before creating the representors which uses it. Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-13net/mlx5: Accept flow rules without matchMark Bloch1-2/+1
Allow passing NULL spec when creating a flow rule. Such rules will act as "catch all" flow rules. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-13net/mlx5: E-Switch, Refactor unload all reps per rep typeBodong Wang1-19/+5
Following introduction of per vport configuration of vport and rep, unload all reps per rep type is still needed as IB reps can be unloaded individually. However, a few internal functions exist purely for this purpose, merge them to a single function. This patch doesn't change any existing functionality. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-13net/mlx5: E-Switch, Update VF vports config when num of VFs changedBodong Wang1-64/+3
Currently, ECPF eswitch manager does one-time only configuration for VF vports when device switches to offloads mode. However, when num of VFs changed from host side, driver doesn't update VF vports configurations. Hence, perform VFs vport configuration update whenever num_vfs change event occurs. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-13net/mlx5: E-Switch, Introduce per vport configuration for eswitch modesBodong Wang1-85/+58
Both legacy and offload modes require vport setup, only offload mode requires rep setup. Before this patch, vport and rep operations are separated applied to all relevant vports in different stages. Change to use per vport configuration, so that vport and rep operations are modularized per vport. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-12net/mlx5: E-Switch, Add support for offloading rules with no in_portPaul Blakey1-1/+3
FTEs in global tables may match on packets from multiple in_ports. Provide the capability to omit the in_port match condition. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-12net/mlx5: E-Switch, Introduce global tablesPaul Blakey1-5/+13
Currently, flow tables are automatically connected according to their <chain,prio,level> tuple. Introduce global tables which are flow tables that are detached from the eswitch chains processing, and will be connected by explicitly referencing them from multiple chains. Add this new table type, and allow connecting them by refenece. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-12net/mlx5: E-Switch, Enable reg c1 loopback when possiblePaul Blakey1-11/+33
Enable reg c1 loopback if firmware reports it's supported, as this is needed for restoring packet metadata (e.g chain). Also define helper to query if it is enabled. Signed-off-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-12Merge branch 'ct-offload' of ↵David S. Miller1-21/+220
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
2020-03-09net/mlx5: E-switch, make query inline mode a static functionParav Pandit1-37/+37
mlx5_eswitch_inline_mode_get() is used only in eswitch_offloads.c. Hence, make it static and adjacent to its caller function. Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-09net/mlx5e: Fix an IS_ERR() vs NULL checkDan Carpenter1-1/+1
The esw_vport_tbl_get() function returns error pointers on error. Fixes: 96e326878fa5 ("net/mlx5e: Eswitch, Use per vport tables for mirroring") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-09net/mlx5: E-Switch, Use vport metadata matching only when mandatoryMajd Dibbiny1-1/+13
Multi-port RoCE mode requires tagging traffic that passes through the vport. This matching can cause performance degradation, therefore disable it and use the legacy matching on vhca_id and source_port when possible. Fixes: 92ab1eb392c6 ("net/mlx5: E-Switch, Enable vport metadata matching if firmware supports it") Signed-off-by: Majd Dibbiny <majd@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-27net/mlx5e: Add devlink fdb_large_groups parameterJianbo Liu1-1/+3
Add a devlink parameter to control the number of large groups in a autogrouped flow table. The default value is 15, and the range is between 1 and 1024. The size of each large group can be calculated according to the following formula: size = 4M / (fdb_large_groups + 1). Examples: - Set the number of large groups to 20. $ devlink dev param set pci/0000:82:00.0 name fdb_large_groups \ cmode driverinit value 20 Then run devlink reload command to apply the new value. $ devlink dev reload pci/0000:82:00.0 - Read the number of large groups in flow table. $ devlink dev param show pci/0000:82:00.0 name fdb_large_groups pci/0000:82:00.0: name fdb_large_groups type driver-specific values: cmode driverinit value 20 Signed-off-by: Jianbo Liu <jianbol@mellanox.com> Reviewed-by: Vlad Buslov <vladbu@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-27net/mlx5e: Eswitch, Use per vport tables for mirroringEli Cohen1-10/+196
When using port mirroring, we forward the traffic to another table and use that table to forward to the mirrored vport. Since the hardware loses the values of reg c, and in particular reg c0, we fail the match on the input vport which previously existed in reg c0. To overcome this situation, we use a set of per vport tables, positioned at the lowest priority, and forward traffic to those tables. Since these tables are per vport, we can avoid matching on reg c0. Fixes: c01cfd0f1115 ("net/mlx5: E-Switch, Add match on vport metadata for rule in fast path") Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-19net/mlx5: E-Switch, Get reg_c1 value on missPaul Blakey1-3/+28
The HW model implicitly decapsulates tunnels on chain 0 and sets reg_c1 with the mapped tunnel id. On miss, the packet does not have the outer header and the driver restores the tunnel information from the tunnel id. Getting reg_c1 value in software requires enabling reg_c1 loopback and copying reg_c1 to reg_b. reg_b comes up on CQE as cqe->imm_inval_pkey. Use the reg_c0 restoration rules to also copy reg_c1 to reg_B. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-19net/mlx5: E-Switch, Get reg_c0 value on CQEPaul Blakey1-9/+138
On RX side create a restore table in OFFLOADS namespace. This table will match on all values for reg_c0 we will use, and set it to the flow_tag. This flow tag can then be read on the CQE. As there is no copy action from reg c0 to flow tag, instead we have to set the flow tag explictily. We add an API so callers can add all the used reg_c0 values (tags) and for each of those we add a restore rule. This will be used in a following patch to save the miss chain mapping tag on reg_c0 and from it restore the tc chain on the skb. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-19net/mlx5: E-Switch, Move source port on reg_c0 to the upper 16 bitsPaul Blakey1-10/+54
Multi chain support requires the miss path to continue the processing from the last chain id, and for that we need to save the chain miss tag (a mapping for 32bit chain id) on reg_c0 which will come in a next patch. Currently reg_c0 is exclusively used to store the source port metadata, giving it 32bit, it is created from 16bits of vcha_id, and 16bits of vport number. We will move this source port metadata to upper 16bits, and leave the lower bits for the chain miss tag. We compress the reg_c0 source port metadata to 16bits by taking 8 bits from vhca_id, and 8bits from the vport number. Since we compress the vport number to 8bits statically, and leave two top ids for special PF/ECPF numbers, we will only support a max of 254 vports with this strategy. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-18net/mlx5e: Don't clear the whole vf config when switching modesDmytro Linkin1-2/+2
There is no need to reset all vf config (except link state) between legacy and switchdev modes changes. Also, set link state to AUTO, when legacy enabled. Fixes: 3b83b6c2e024 ("net/mlx5e: Clear VF config when switching modes") Signed-off-by: Dmytro Linkin <dmitrolin@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller1-3/+8
Minor conflict in mlx5 because changes happened to code that has moved meanwhile. Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-24net/mlx5e: Clear VF config when switching modesDmytro Linkin1-3/+8
Currently VF in LEGACY mode are not able to go up. Also in OFFLOADS mode, when switching to it first time, VF can go up independently to his representor, which is not expected. Perform clearing of VF config when switching modes and set link state to AUTO as default value. Also, when switching to OFFLOADS mode set link state to DOWN, which allow VF link state to be controlled by its REP. Fixes: 1ab2068a4c66 ("net/mlx5: Implement vports admin state backup/restore") Fixes: 556b9d16d3f5 ("net/mlx5: Clear VF's configuration on disabling SRIOV") Signed-off-by: Dmytro Linkin <dmitrolin@mellanox.com> Signed-off-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-24net/mlx5: Fix lowest FDB pool sizePaul Blakey1-1/+1
The pool sizes represent the pool sizes in the fw. when we request a pool size from fw, it will return the next possible group. We track how many pools the fw has left and start requesting groups from the big to the small. When we start request 4k group, which doesn't exists in fw, fw wants to allocate the next possible size, 64k, but will fail since its exhausted. The correct smallest pool size in fw is 128 and not 4k. Fixes: e52c28024008 ("net/mlx5: E-Switch, Add chains and priorities") Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-16net/mlx5: E-Switch, Increase number of chains and prioritiesPaul Blakey1-1/+2
Increase the number of chains and priorities to support the whole range available in tc. We use unmanaged tables and ignore flow level to create more tables than what we declared to fs_core steering, and we manage the connections between the tables themselves. To support that we need FW with ignore_flow_level capability. Otherwise the old behaviour will be used, where we are limited by the number of levels we declared (4 chains, 16 prios). Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-16net/mlx5: E-Switch, Refactor chains and prioritiesPaul Blakey1-258/+45
To support the entire chain and prio range (32bit + 16bit), instead of a using a static array of chains/prios of limited size, create them dynamically, and use a rhashtable to search for existing chains/prio combinations. This will be used in next patch to actually increase the number using unamanged tables support and ignore flow level capability. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-16net/mlx5: ft: Use getter function to get ft chainPaul Blakey1-0/+5
FT chain is defined as the next chain after tc. To prepare for next patches that will increase the number of tc chains available at runtime, use a getter function to get this value. The define is still used in static fs_core allocation, to calculate the number of chains. This static allocation will be used if the relevant capabilities won't be available to support dynamic chains. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-16net/mlx5: Refactor mlx5_create_auto_grouped_flow_tablePaul Blakey1-6/+7
Refactor mlx5_create_auto_grouped_flow_table() to use ft_attr param which already carries the max_fte, prio and flags memebers, and is used the same in similar mlx5_create_flow_table() function. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-12-05net/mlx5e: E-switch, Fix Ingress ACL groups in switchdev mode for prio tagParav Pandit1-37/+85
In cited commit, when prio tag mode is enabled, FTE creation fails due to missing group with valid match criteria. Hence, (a) create prio tag group metadata_prio_tag_grp when prio tag is enabled with match criteria for vlan push FTE. (b) Rename metadata_grp to metadata_allmatch_grp to reflect its purpose. Also when priority tag is enabled, delete metadata settings after deleting ingress rules, which are using it. Tide up rest of the ingress config code for unnecessary labels. Fixes: 10652f39943e ("net/mlx5: Refactor ingress acl configuration") Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Eli Britstein <elibr@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-11-13Merge branch 'mlx5-next' of ↵Saeed Mahameed1-123/+146
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux 1) New generic devlink param "enable_roce", for downstream devlink reload support 2) Do vport ACL configuration on per vport basis when enabling/disabling a vport. This enables to have vports enabled/disabled outside of eswitch config for future 3) Split the code for legacy vs offloads mode and make it clear 4) Tide up vport locking and workqueue usage 5) Fix metadata enablement for ECPF 6) Make explicit use of VF property to publish IB_DEVICE_VIRTUAL_FUNCTION 7) E-Switch and flow steering core low level support and refactoring for netfilter flowtables offload Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-11-13net/mlx5: Accumulate levels for chains prio namespacesPaul Blakey1-1/+1
Tc chains are implemented by creating a chained prio steering type, and inside it there is a namespace for each chain (FDB_TC_MAX_CHAINS). Each of those has a list of priorities. Currently, all namespaces in a prio start at the parent prio level. But since we can jump from chain (namespace) to another chain in the same prio, we need the levels for higher chains to be higher as well. So we created unused prios to account for levels in previous namespaces. Fix that by accumulating the namespaces levels if we are inside a chained type prio, and removing the unused prios. Fixes: 328edb499f99 ('net/mlx5: Split FDB fast path prio to multiple namespaces') Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>