author: David S. Miller <davem@davemloft.net> 2018-09-06 15:42:04 -0700
committer: David S. Miller <davem@davemloft.net> 2018-09-06 15:42:04 -0700
commit: ddc9cc0131619382678771b0b85632f28bcf2521
tree: 4116a4ee819ecf54ac8c6c029c5353c7efcfc143 /include
parent: 2002bc328ca3e23b7b31849d823b67d4d7c1fd41
parent: fe1dc069990c1f290ef6b99adb46332c03258f38
Merge tag 'mlx5e-updates-2018-09-05' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:

====================
mlx5e-updates-2018-09-05

This series provides updates to the mlx5 ethernet driver.

1) Starting with a four-patch series to optimize flow counter updates,
from Vlad Buslov:
==============================================
By default, the mlx5 driver updates cached counters each second. The update
function consumes a noticeable amount of CPU resources. The goal of this
patch series is to optimize the update function.

Investigation revealed the following bottlenecks in the fs counters
implementation:
1) The update code (scheduled each second) iterates over all counters twice:
first to find and delete counters that are marked for deletion, then to
actually update the counters.
2) Counters are stored in an rb tree. Linear iteration over all rb tree
elements (rb_next in profiling data) consumed ~65% of the time spent in the
update function.

The following optimizations were implemented:
1) Instead of just marking counters for deletion, store them in a standalone
list. This removes the first iteration over the whole counters tree.
2) Store counters in a sorted list to optimize traversing them and remove
calls to rb_next.

The first implementation of these changes degraded performance instead of
improving it. Investigation revealed that the first cache line of struct
mlx5_fc is full, and adding anything to it causes the number of cache misses
to double. To mitigate that, the following refactorings were implemented:
- Change the 'addlist' list type from doubly linked to singly linked. This
frees space for one additional pointer, which is used to store the deletion
list (optimization 1).
- Substitute the rb tree with an idr. Idr is a non-intrusive data structure
and doesn't require adding any new members to struct mlx5_fc. Use the free
space that became available for the doubly linked sorted list that is used
for traversing all counters (optimization 2).

The described changes reduced CPU time spent in mlx5_fc_stats_work from 70%
to 44% (global perf profile mode).
============================================

The rest of the series are misc updates:

2) From Kamal, move mlx5e_priv_flags into en_ethtool.c to avoid a
compilation warning.
3) From Roi Dayan, move Q counters allocation and drop RQ to the init_rx
profile function, to avoid allocating Q counters when not required.
4) From Shay Agroskin, replace the PTP clock lock from an RW lock to a seq
lock. This almost doubles the packet rate when timestamping is active on
multiple TX queues.
5) From Natali Shechtman, set ECN for received packets using the CQE
indication.
6) From Alaa Hleihel, don't set CHECKSUM_COMPLETE on SCTP packets.
CHECKSUM_COMPLETE is not applicable to the SCTP protocol.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
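The 'addlist' and deletion-list change described above relies on a singly linked list that producers can push to without a lock and that the periodic worker can detach in one step. The following is a minimal user-space sketch of that idea in C11 atomics. It is not the driver code: struct node, struct lockless_list, list_init(), list_push(), list_take_all() and drain_count() are hypothetical names, loosely modelled on the kernel's llist_add() and llist_del_all().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins for the kernel's llist types. */
struct node {
	struct node *next;	/* one pointer per entry, not two */
};

struct lockless_list {
	_Atomic(struct node *) first;
};

static void list_init(struct lockless_list *l)
{
	atomic_init(&l->first, (struct node *)NULL);
}

/* Push one node at the head; returns 1 if the list was empty before. */
static int list_push(struct lockless_list *l, struct node *n)
{
	struct node *old;

	assert(n != NULL);
	old = atomic_load_explicit(&l->first, memory_order_relaxed);
	do {
		n->next = old;	/* on failure, old is refreshed by the CAS */
	} while (!atomic_compare_exchange_weak_explicit(&l->first, &old, n,
							memory_order_release,
							memory_order_relaxed));
	return old == NULL;
}

/* Detach the whole chain with a single atomic exchange. */
static struct node *list_take_all(struct lockless_list *l)
{
	return atomic_exchange_explicit(&l->first, (struct node *)NULL,
					memory_order_acquire);
}

/* What a periodic worker might do: take everything, then walk privately. */
static size_t drain_count(struct lockless_list *l)
{
	size_t n = 0;
	struct node *it;

	for (it = list_take_all(l); it != NULL; it = it->next)
		n++;
	return n;
}
```

Because the worker takes the whole chain at once, it never walks a list that other threads are still modifying, and each entry carries a single next pointer instead of the two that a doubly linked list needs.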
Diffstat (limited to 'include')
-rw-r--r-- include/linux/mlx5/driver.h | 11
1 files changed, 6 insertions, 5 deletions
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 7a452716de4b..b7fce2c9443d 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -583,10 +583,11 @@ struct mlx5_irq_info {
};
struct mlx5_fc_stats {
- struct rb_root counters;
- struct list_head addlist;
- /* protect addlist add/splice operations */
- spinlock_t addlist_lock;
+ spinlock_t counters_idr_lock; /* protects counters_idr */
+ struct idr counters_idr;
+ struct list_head counters;
+ struct llist_head addlist;
+ struct llist_head dellist;
struct workqueue_struct *wq;
struct delayed_work work;
@@ -804,7 +805,7 @@ struct mlx5_pps {
};
struct mlx5_clock {
- rwlock_t lock;
+ seqlock_t lock;
struct cyclecounter cycles;
struct timecounter tc;
struct hwtstamp_config hwtstamp_config;
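
The second hunk swaps the clock's rwlock_t for a seqlock_t. As a rough illustration of why that can help readers on a hot path, here is a user-space sketch of the sequence-counter pattern in C11 atomics. It is an assumption-laden model, not the kernel implementation: struct clock_snapshot, clock_init(), clock_write() and clock_read() are hypothetical names, and the sketch assumes a single writer (the kernel's seqlock_t serializes writers with an embedded spinlock, which is omitted here).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical snapshot guarded by a sequence counter. */
struct clock_snapshot {
	atomic_uint seq;		/* even: stable, odd: write in progress */
	_Atomic uint64_t cycles;
	_Atomic uint64_t nsec;
};

static void clock_init(struct clock_snapshot *c)
{
	atomic_init(&c->seq, 0u);
	atomic_init(&c->cycles, (uint64_t)0);
	atomic_init(&c->nsec, (uint64_t)0);
}

/* Writer side; assumes no other writer runs concurrently. */
static void clock_write(struct clock_snapshot *c, uint64_t cycles,
			uint64_t nsec)
{
	unsigned int s = atomic_load_explicit(&c->seq, memory_order_relaxed);

	assert((s & 1u) == 0);	/* an odd value would mean a second writer */
	atomic_store_explicit(&c->seq, s + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&c->cycles, cycles, memory_order_relaxed);
	atomic_store_explicit(&c->nsec, nsec, memory_order_relaxed);
	atomic_store_explicit(&c->seq, s + 2, memory_order_release);
}

/* Reader side: takes no lock, retries if a write overlapped the read. */
static void clock_read(struct clock_snapshot *c, uint64_t *cycles,
		       uint64_t *nsec)
{
	unsigned int s0, s1;

	do {
		s0 = atomic_load_explicit(&c->seq, memory_order_acquire);
		*cycles = atomic_load_explicit(&c->cycles, memory_order_relaxed);
		*nsec = atomic_load_explicit(&c->nsec, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
		s1 = atomic_load_explicit(&c->seq, memory_order_relaxed);
	} while ((s0 & 1u) || s0 != s1);
}
```

Readers in this scheme never write to shared memory, so several TX queues reading the clock do not contend on a lock word. In the kernel, the corresponding calls are read_seqbegin() and read_seqretry() on the reader side and write_seqlock() and write_sequnlock() on the writer side.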