summaryrefslogtreecommitdiffstats
path: root/net/wireless
diff options
context:
space:
mode:
authorRaghavendra K T <raghavendra.kt@linux.vnet.ibm.com>2015-08-30 11:29:42 +0530
committerDavid S. Miller <davem@davemloft.net>2015-08-30 21:48:58 -0700
commita3a773726c9f9ba2e87fd8ad8e36feff5f6ffd8e (patch)
treebc03244dc5119515d96c6c9a64b0c871ad554385 /net/wireless
parentc4c6bc314618f60ba69b0cbf93e506e4c38a11d2 (diff)
downloadlinux-a3a773726c9f9ba2e87fd8ad8e36feff5f6ffd8e.tar.bz2
net: Optimize snmp stat aggregation by walking all the percpu data at once
Docker container creation linearly increased from around 1.6 sec to 7.5 sec (at 1000 containers) and perf data showed 50% ovehead in snmp_fold_field. reason: currently __snmp6_fill_stats64 calls snmp_fold_field that walks through per cpu data of an item (iteratively for around 36 items). idea: This patch tries to aggregate the statistics by going through all the items of each cpu sequentially which is reducing cache misses. Docker creation got faster by more than 2x after the patch. Result: Before After Docker creation time 6.836s 3.25s cache miss 2.7% 1.41% perf before: 50.73% docker [kernel.kallsyms] [k] snmp_fold_field 9.07% swapper [kernel.kallsyms] [k] snooze_loop 3.49% docker [kernel.kallsyms] [k] veth_stats_one 2.85% swapper [kernel.kallsyms] [k] _raw_spin_lock perf after: 10.57% docker docker [.] scanblock 8.37% swapper [kernel.kallsyms] [k] snooze_loop 6.91% docker [kernel.kallsyms] [k] snmp_get_cpu_field 6.67% docker [kernel.kallsyms] [k] veth_stats_one changes/ideas suggested: Using buffer in stack (Eric), Usage of memset (David), Using memcpy in place of unaligned_put (Joe). Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Diffstat (limited to 'net/wireless')
0 files changed, 0 insertions, 0 deletions