|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU changes from Ingo Molnar:
"The main changes in this cycle were:
- changes permitting use of call_rcu() and friends very early in
boot, for example, before rcu_init() is invoked.
- add in-kernel API to enable and disable expediting of normal RCU
grace periods.
- improve RCU's handling of (hotplug-) outgoing CPUs.
- NO_HZ_FULL_SYSIDLE fixes.
- tiny-RCU updates to make it more tiny.
- documentation updates.
- miscellaneous fixes"
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (58 commits)
cpu: Provide smpboot_thread_init() on !CONFIG_SMP kernels as well
cpu: Defer smpboot kthread unparking until CPU known to scheduler
rcu: Associate quiescent-state reports with grace period
rcu: Yet another fix for preemption and CPU hotplug
rcu: Add diagnostics to grace-period cleanup
rcutorture: Default to grace-period-initialization delays
rcu: Handle outgoing CPUs on exit from idle loop
cpu: Make CPU-offline idle-loop transition point more precise
rcu: Eliminate ->onoff_mutex from rcu_node structure
rcu: Process offlining and onlining only at grace-period start
rcu: Move rcu_report_unblock_qs_rnp() to common code
rcu: Rework preemptible expedited bitmask handling
rcu: Remove event tracing from rcu_cpu_notify(), used by offline CPUs
rcutorture: Enable slow grace-period initializations
rcu: Provide diagnostic option to slow down grace-period initialization
rcu: Detect stalls caused by failure to propagate up rcu_node tree
rcu: Eliminate empty HOTPLUG_CPU ifdef
rcu: Simplify sync_rcu_preempt_exp_init()
rcu: Put all orphan-callback-related code under same comment
rcu: Consolidate offline-CPU callback initialization
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing updates from Steven Rostedt:
"Some clean ups and small fixes, but the biggest change is the addition
of the TRACE_DEFINE_ENUM() macro that can be used by tracepoints.
Tracepoints have helper functions for TP_printk(), called
__print_symbolic() and __print_flags(), that let a numeric value be
displayed as human-comprehensible text. What is placed in the
TP_printk() is also shown in the tracepoint format file such that user
space tools like perf and trace-cmd can parse the binary data and
express the values too. Unfortunately, the way the TRACE_EVENT()
macro works, anything placed in the TP_printk() will be shown pretty
much exactly as is. The problem arises when enums are used. That's
because unlike macros, enums will not be changed into their values by
the C pre-processor. Thus, the enum string is exported to the format
file, and this makes it useless for user space tools.
The TRACE_DEFINE_ENUM() macro solves this by converting the enum strings in
the TP_printk() format into their numeric values, and that is what is shown to
user space. For example, the tracepoint tlb_flush currently has this
in its format file:
__print_symbolic(REC->reason,
{ TLB_FLUSH_ON_TASK_SWITCH, "flush on task switch" },
{ TLB_REMOTE_SHOOTDOWN, "remote shootdown" },
{ TLB_LOCAL_SHOOTDOWN, "local shootdown" },
{ TLB_LOCAL_MM_SHOOTDOWN, "local mm shootdown" })
After adding:
TRACE_DEFINE_ENUM(TLB_FLUSH_ON_TASK_SWITCH);
TRACE_DEFINE_ENUM(TLB_REMOTE_SHOOTDOWN);
TRACE_DEFINE_ENUM(TLB_LOCAL_SHOOTDOWN);
TRACE_DEFINE_ENUM(TLB_LOCAL_MM_SHOOTDOWN);
Its format file will contain this:
__print_symbolic(REC->reason,
{ 0, "flush on task switch" },
{ 1, "remote shootdown" },
{ 2, "local shootdown" },
{ 3, "local mm shootdown" })"
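The preprocessor behaviour described above can be reproduced in plain C. This is a minimal sketch, not the tracing code itself: STRINGIFY, DEMO_MACRO_REASON, DEMO_ENUM_REASON and the helper functions are hypothetical names introduced only for illustration. Stringifying a macro yields its numeric value, while stringifying an enumerator yields only its name, which is the kind of text that reaches the format file when the enum is not mapped.

```c
#include <assert.h>
#include <string.h>

/* Two-level stringification so that macro arguments are expanded first. */
#define STRINGIFY_RAW(x) #x
#define STRINGIFY(x) STRINGIFY_RAW(x)

/* Hypothetical names, for illustration only. */
#define DEMO_MACRO_REASON 1
enum demo_reason { DEMO_ENUM_REASON = 1 };

/* The preprocessor replaces the macro, so the resulting text is "1". */
const char *macro_text(void)
{
	return STRINGIFY(DEMO_MACRO_REASON);
}

/* The preprocessor knows nothing about enums, so the name survives. */
const char *enum_text(void)
{
	return STRINGIFY(DEMO_ENUM_REASON);
}

/* Returns 1 when the text is a plain decimal number, 0 otherwise. */
int is_plain_number(const char *text)
{
	assert(text != NULL);
	return *text != '\0' && strspn(text, "0123456789") == strlen(text);
}
```

Here is_plain_number(macro_text()) holds while is_plain_number(enum_text()) does not, which is the gap a user space parser runs into.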
* tag 'trace-v4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (27 commits)
tracing: Add enum_map file to show enums that have been mapped
writeback: Export enums used by tracepoint to user space
v4l: Export enums used by tracepoints to user space
SUNRPC: Export enums in tracepoints to user space
mm: tracing: Export enums in tracepoints to user space
irq/tracing: Export enums in tracepoints to user space
f2fs: Export the enums in the tracepoints to userspace
net/9p/tracing: Export enums in tracepoints to userspace
x86/tlb/trace: Export enums in used by tlb_flush tracepoint
tracing/samples: Update the trace-event-sample.h with TRACE_DEFINE_ENUM()
tracing: Allow for modules to convert their enums to values
tracing: Add TRACE_DEFINE_ENUM() macro to map enums to their values
tracing: Update trace-event-sample with TRACE_SYSTEM_VAR documentation
tracing: Give system name a pointer
brcmsmac: Move each system tracepoints to their own header
iwlwifi: Move each system tracepoints to their own header
mac80211: Move message tracepoints to their own header
tracing: Add TRACE_SYSTEM_VAR to xhci-hcd
tracing: Add TRACE_SYSTEM_VAR to kvm-s390
tracing: Add TRACE_SYSTEM_VAR to intel-sst
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial tree from Jiri Kosina:
"Usual trivial tree updates. Nothing outstanding -- mostly printk()
and comment fixes and unused identifier removals"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial:
goldfish: goldfish_tty_probe() is not using 'i' any more
powerpc: Fix comment in smu.h
qla2xxx: Fix printks in ql_log message
lib: correct link to the original source for div64_u64
si2168, tda10071, m88ds3103: Fix firmware wording
usb: storage: Fix printk in isd200_log_config()
qla2xxx: Fix printk in qla25xx_setup_mode
init/main: fix reset_device comment
ipwireless: missing assignment
goldfish: remove unreachable line of code
coredump: Fix do_coredump() comment
stacktrace.h: remove duplicate declaration task_struct
smpboot.h: Remove unused function prototype
treewide: Fix typo in printk messages
treewide: Fix typo in printk messages
mod_devicetable: fix comment for match_flags
|
|
ACPICA commit 84f3569db7accc576ace2dae81d101467254fe9d
Was using %d instead of properly using %u.
This patch only affects acpidump tool.
Link: https://github.com/acpica/acpica/commit/84f3569d
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
ACPICA commit 9e2d8180f4d5e61949b17513bae8aff6412f62dd
The offset calculation needn't convert a pointer to a special integer type.
So this patch uses ACPI_TO_INTEGER() instead.
This patch only affects acpidump tool.
Link: https://github.com/acpica/acpica/commit/9e2d8180
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging driver updates from Greg KH:
"Here's the big staging driver patchset for 4.1-rc1.
There are a lot of patches here: the Outreachy application period
happened during this development cycle, so that means that there was a
lot of cleanup patches accepted. Other than the normal coding style
and sparse fixes here, there are some driver updates and work toward
making some of the drivers into "mergable" shape (like the Unisys
drivers.)
All of these have been in linux-next for a while"
* tag 'staging-4.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (1214 commits)
staging: lustre: orthography & coding style
staging: lustre: lnet: lnet: fix error return code
staging: lustre: fix sparse warning
Revert "Staging: sm750fb: Fix C99 Comments"
Staging: rtl8192u: use correct array for debug output
staging: rtl8192e: Remove dead code
staging: rtl8192e: Comment cleanup (style/format)
staging: rtl8192e: Fix indentation in rtllib_rx_auth_resp()
staging: rtl8192e: Decrease nesting of rtllib_rx_auth_resp()
staging: rtl8192e: Divide rtllib_rx_auth()
staging: rtl8192e: Fix PRINTK_WITHOUT_KERN_LEVEL warnings
staging: rtl8192e: Fix DO_WHILE_MACRO_WITH_TRAILING_SEMICOLON warning
staging: rtl8192e: Fix BRACES warning
staging: rtl8192e: Fix LINE_CONTINUATIONS warning
staging: rtl8192e: Fix UNNECESSARY_PARENTHESES warnings
staging: rtl8192e: remove unused EXPORT_SYMBOL_RSL macro
staging: rtl8192e: Fix RETURN_VOID warnings
staging: rtl8192e: Fix UNNECESSARY_ELSE warning
staging: rtl8723au: Remove unneeded comments
staging: rtl8723au: Use __func__ in trace logs
...
|
|
The first argument passed to find_probe_point_lazy() should be the CU die,
which is passed on to die_walk_lines() when lazy_line matches.
Currently, when we probe a file with a lazy_line pattern but without a
function name, a NULL pointer is passed instead and causes a segmentation fault.
This can be reproduced as follows:
$ perf probe -k vmlinux --add='fs/super.c;s->s_count=1;'
[ 1958.984658] perf[1020]: segfault at 10 ip 00007fc6e10d8c71 sp
00007ffcbfaaf900 error 4 in libdw-0.161.so[7fc6e10ce000+34000]
Segmentation fault
After this patch:
$ perf probe -k vmlinux --add='fs/super.c;s->s_count=1;'
Added new event:
probe:_stext (on @fs/super.c)
You can now use it in all perf tools, such as:
perf record -e probe:_stext -aR sleep 1
Signed-off-by: He Kuang <hekuang@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1428925290-5623-3-git-send-email-hekuang@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
With lazy matching, perf fails to open the source file if the command
is invoked outside of the compilation directory:
$ perf probe -a '__schedule;clear_*'
Failed to open kernel/sched/core.c: No such file or directory
Error: Failed to add events. (-2)
OTOH, other commands like "probe -L" can resolve the source directory by
themselves. Let's make that possible for lazy matching too!
Signed-off-by: Naohiro Aota <naota@elisp.net>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1426223923-1493-1-git-send-email-naota@elisp.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When perf probe searches a debuginfo file and fails, it tries an
alternative in get_alternative_probe_event():
memcpy(tmp, &pev->point, sizeof(*tmp));
memset(&pev->point, 0, sizeof(pev->point));
This drops the retprobe flag, and find_alternative_probe_point() never
sets it back, so a return probe is registered as an ordinary probe.
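The save-and-clear sequence above can be sketched as follows. This is an illustration under assumptions, not the perf source: struct demo_probe_point and use_alternative_point() are hypothetical stand-ins, and only the handling of the retprobe flag is meant to mirror the description.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Simplified stand-in for a probe point; field names are illustrative. */
struct demo_probe_point {
	const char *function;
	unsigned long offset;
	bool retprobe;
};

/*
 * Save the original point in *saved, clear *point, then fill in the
 * alternative location while carrying the retprobe flag over.
 */
void use_alternative_point(struct demo_probe_point *point,
			   struct demo_probe_point *saved,
			   const char *alt_function, unsigned long alt_offset)
{
	assert(point != NULL && saved != NULL);

	memcpy(saved, point, sizeof(*saved));
	memset(point, 0, sizeof(*point));	/* retprobe is now false */

	point->function = alt_function;
	point->offset = alt_offset;
	point->retprobe = saved->retprobe;	/* the step that is easy to forget */
}
```

Without the last assignment the cleared structure would describe an entry probe regardless of what the user asked for.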
This can be reproduced as follows:
$ perf probe -v -k vmlinux --add='sys_write%return'
...
Added new event:
Writing event: p:probe/sys_write _stext+1584952
probe:sys_write (on sys_write%return)
$ cat /sys/kernel/debug/tracing/kprobe_events
p:probe/sys_write _stext+1584952
After this patch:
$ perf probe -v -k vmlinux --add='sys_write%return'
Added new event:
Writing event: r:probe/sys_write SyS_write+0
probe:sys_write (on sys_write%return)
$ cat /sys/kernel/debug/tracing/kprobe_events
r:probe/sys_write SyS_write
Signed-off-by: He Kuang <hekuang@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1428925290-5623-1-git-send-email-hekuang@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 asm changes from Ingo Molnar:
"There were lots of changes in this development cycle:
- over 100 separate cleanups, restructuring changes, speedups and
fixes in the x86 system call, irq, trap and other entry code, part
of a heroic effort to deobfuscate a decade old spaghetti asm code
and its C code dependencies (Denys Vlasenko, Andy Lutomirski)
- alternatives code fixes and enhancements (Borislav Petkov)
- simplifications and cleanups to the compat code (Brian Gerst)
- signal handling fixes and new x86 testcases (Andy Lutomirski)
- various other fixes and cleanups
By their nature many of these changes are risky - we tried to test
them well on many different x86 systems (there are no known
regressions), and they are split up finely to help bisection - but
there's still a fair bit of residual risk left so caveat emptor"
* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (148 commits)
perf/x86/64: Report regs_user->ax too in get_regs_user()
perf/x86/64: Simplify regs_user->abi setting code in get_regs_user()
perf/x86/64: Do report user_regs->cx while we are in syscall, in get_regs_user()
perf/x86/64: Do not guess user_regs->cs, ss, sp in get_regs_user()
x86/asm/entry/32: Tidy up JNZ instructions after TESTs
x86/asm/entry/64: Reduce padding in execve stubs
x86/asm/entry/64: Remove GET_THREAD_INFO() in ret_from_fork
x86/asm/entry/64: Simplify jumps in ret_from_fork
x86/asm/entry/64: Remove a redundant jump
x86/asm/entry/64: Optimize [v]fork/clone stubs
x86/asm/entry: Zero EXTRA_REGS for stub32_execve() too
x86/asm/entry/64: Move stub_x32_execvecloser() to stub_execveat()
x86/asm/entry/64: Use common code for rt_sigreturn() epilogue
x86/asm/entry/64: Add forgotten CFI annotation
x86/asm/entry/irq: Simplify interrupt dispatch table (IDT) layout
x86/asm/entry/64: Move opportunistic sysret code to syscall code path
x86, selftests: Add sigreturn selftest
x86/alternatives: Guard NOPs optimization
x86/asm/entry: Clear EXTRA_REGS for all executable formats
x86/signal: Remove pax argument from restore_sigcontext
...
|
|
syntax only.
The cool kids are now using the phrase "base frequency",
where in the past we used "max non-turbo frequency" or "TSC frequency".
This distinction becomes important when a processor has a TSC
that runs at a different speed than the "base frequency".
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
cosmetic only.
order the decoding of MSR_PERF_LIMIT_REASONS bits
from MSB to LSB -- which you notice when more than 1 bit is set
and you are, say, comparing the output to the documentation...
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
Casual turbostat users generally just want to know MHz.
So by default, just print enough information to make sense of MHz.
All the other configuration data and columns for C-states and temperature etc,
are printed with the --debug option.
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kselftest updates from Shuah Khan:
"This is a milestone update in a sense. Several new tests and install
and packaging support are added in this update.
This update adds install and packaging tools developed on top of
back-end shared logic enhancements to run and install tests. In
addition several timer tests are added.
- New timer tests from John Stultz
- rtc test from Prarit Bhargava
- Enhancements to run and install tests from Michael Ellerman
- Install and packaging tools from Shuah Khan
- Cross-compilation enablement from Tyler Baker
- A couple of bug fixes"
* tag 'linux-kselftest-4.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: (42 commits)
ftracetest: Do not use usleep directly
selftest/mqueue: enable cross compilation
selftest/ipc: enable cross compilation
selftest/memfd: include default header install path
selftest/mount: enable cross compilation
selftest/memfd: enable cross compilation
kselftests: timers: Make set-timer-lat fail more gracefully for !CAP_WAKE_ALARM
selftests: Change memory on-off-test.sh name to be unique
selftests: change cpu on-off-test.sh name to be unique
selftests/mount: Make git ignore all binaries in mount test suite
kselftests: timers: Reduce default runtime on inconsistency-check and set-timer-lat
ftracetest: Convert exit -1 to exit $FAIL
ftracetest: Cope properly with stack tracer not being enabled
tools, update rtctest.c to verify passage of time
Documentation, split up rtc.txt into documentation and test file
selftests: Add tool to generate kselftest tar archive
selftests: Add kselftest install tool
selftests: Set CC using CROSS_COMPILE once in lib.mk
selftests: Add install support for the powerpc tests
selftests/timers: Use shared logic to run and install tests
...
|
|
The perf kmem command records and analyzes kernel memory allocation only
for SLAB objects. This patch implements a simple page allocator analyzer
using the kmem:mm_page_alloc and kmem:mm_page_free events.
It adds two new options, --slab and --page. The --slab option is for
analyzing the SLAB allocator, which is what perf kmem currently does.
The new --page option enables page allocator events and analyzes kernel
memory usage in units of pages. Currently, only the 'stat --alloc'
subcommand is implemented.
If neither --slab nor --page is specified, --slab is implied.
First run 'perf kmem record' to generate a suitable perf.data file:
# perf kmem record --page sleep 5
Then run 'perf kmem stat' to postprocess the perf.data file:
# perf kmem stat --page --alloc --line 10
-------------------------------------------------------------------------------
PFN | Total alloc (KB) | Hits | Order | Mig.type | GFP flags
-------------------------------------------------------------------------------
4045014 | 16 | 1 | 2 | RECLAIM | 00285250
4143980 | 16 | 1 | 2 | RECLAIM | 00285250
3938658 | 16 | 1 | 2 | RECLAIM | 00285250
4045400 | 16 | 1 | 2 | RECLAIM | 00285250
3568708 | 16 | 1 | 2 | RECLAIM | 00285250
3729824 | 16 | 1 | 2 | RECLAIM | 00285250
3657210 | 16 | 1 | 2 | RECLAIM | 00285250
4120750 | 16 | 1 | 2 | RECLAIM | 00285250
3678850 | 16 | 1 | 2 | RECLAIM | 00285250
3693874 | 16 | 1 | 2 | RECLAIM | 00285250
... | ... | ... | ... | ... | ...
-------------------------------------------------------------------------------
SUMMARY (page allocator)
========================
Total allocation requests : 44,260 [ 177,256 KB ]
Total free requests : 117 [ 468 KB ]
Total alloc+freed requests : 49 [ 196 KB ]
Total alloc-only requests : 44,211 [ 177,060 KB ]
Total free-only requests : 68 [ 272 KB ]
Total allocation failures : 0 [ 0 KB ]
Order Unmovable Reclaimable Movable Reserved CMA/Isolated
----- ------------ ------------ ------------ ------------ ------------
0 32 . 44,210 . .
1 . . . . .
2 . 18 . . .
3 . . . . .
4 . . . . .
5 . . . . .
6 . . . . .
7 . . . . .
8 . . . . .
9 . . . . .
10 . . . . .
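The Order and Total alloc (KB) columns in the sample output are related by a power of two. A small sketch, assuming 4 KB pages (the page size is not stated in the output, and alloc_size_kb() is a hypothetical helper): an order-2 allocation is 4 pages, i.e. 16 KB, and 44,242 order-0 plus 18 order-2 allocations add up to the 177,256 KB shown in the summary.

```c
#include <assert.h>

/* Size in KB of one allocation of the given order, i.e. 2^order pages. */
unsigned long alloc_size_kb(unsigned int order, unsigned long page_size_bytes)
{
	assert(order < 8 * sizeof(unsigned long));
	return ((1UL << order) * page_size_bytes) / 1024;
}
```

For example, alloc_size_kb(2, 4096) gives 16, matching the per-PFN rows in the table.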
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/1428298576-9785-4-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Check that a syscall made during an active transaction will fail with
the correct failure code and that one made during a suspended
transaction will succeed.
Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Move get_auxv_entry() from pmu/lib.c up to harness.c in order to make
it available to other tests.
Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
The data_head and data_tail fields are defined as __u64 in
linux/perf_event.h, but perf userspace uses int and unsigned int.
Convert all references to u64 for consistency.
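One way to see what a narrower type can lose is sketched below. It assumes a 32-bit unsigned int and a position value that has grown past 4 GiB; the helper names are hypothetical and this is not the perf code.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes between tail and head when both are kept as 64-bit values. */
uint64_t pending_bytes_u64(uint64_t head, uint64_t tail)
{
	assert(head >= tail);
	return head - tail;
}

/* What a narrower copy of a 64-bit position retains. */
unsigned int truncate_to_uint(uint64_t position)
{
	return (unsigned int)position;	/* keeps only the low bits */
}
```

With a position of 0x100000010, the truncated copy is 0x10 on such a platform, so it no longer matches the value the kernel wrote.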
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1428420037-26599-1-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
To avoid probing in an unintended binary, an orphaned -x option must be
detected and reported.
Without this patch, the following command sets up the probe in the kernel.
-----
# perf probe -a strcpy -x ./perf
Added new event:
probe:strcpy (on strcpy)
You can now use it in all perf tools, such as:
perf record -e probe:strcpy -aR sleep 1
-----
But in this case, it seems that the user may want to probe in the perf
binary. With this patch, perf-probe correctly handles the orphaned -x.
-----
# perf probe -a strcpy -x ./perf
Error: -x/-m must follow the probe definitions.
...
-----
Reported-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20150401102541.17137.75477.stgit@localhost.localdomain
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Support multiple probes on different binaries with just
one command.
As a result, this example sets up probes on icmp_rcv in the
kernel, on main and set_target in perf, and on pcspkr_event
in the pcspkr.ko driver.
-----
# perf probe -a icmp_rcv -x ./perf -a main -a set_target \
-m /lib/modules/4.0.0-rc5+/kernel/drivers/input/misc/pcspkr.ko \
-a pcspkr_event
Added new event:
probe:icmp_rcv (on icmp_rcv)
You can now use it in all perf tools, such as:
perf record -e probe:icmp_rcv -aR sleep 1
Added new event:
probe_perf:main (on main in /home/mhiramat/ksrc/linux-3/tools/perf/perf)
You can now use it in all perf tools, such as:
perf record -e probe_perf:main -aR sleep 1
Added new event:
probe_perf:set_target (on set_target in /home/mhiramat/ksrc/linux-3/tools/perf/perf)
You can now use it in all perf tools, such as:
perf record -e probe_perf:set_target -aR sleep 1
Added new event:
probe:pcspkr_event (on pcspkr_event in pcspkr)
You can now use it in all perf tools, such as:
perf record -e probe:pcspkr_event -aR sleep 1
-----
Reported-by: Arnaldo Carvalho de Melo <acme@infradead.org>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20150401102539.17137.46454.stgit@localhost.localdomain
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Commit f3b623b8490a ("perf tools: Reference count struct thread")
appends every thread->node to dead_threads in machine__remove_thread()
and calls list_del_init() on this node in thread__put().
perf_event__exit_del_thread() releases the thread without using
machine__remove_thread(), which causes a NULL pointer crash when
list_del_init(&thread->node) is called. Fix this by using
machine__remove_thread() instead of calling thread__put() directly.
This problem can be reproduced as follows:
$ perf record ls
$ perf buildid-list --with-hits
[ 3874.195070] perf[1018]: segfault at 0 ip 00000000004b0b15 sp
00007ffc35b44780 error 6 in perf[400000+166000]
Segmentation fault
After this patch:
$ perf record ls
$ perf buildid-list --with-hits
bc23e7c3281e542650ba4324421d6acf78f4c23e /proc/kcore
643324cb0e969f30c56d660f167f84a150845511 [vdso]
0000000000000000000000000000000000000000 /bin/busybox
...
Signed-off-by: He Kuang <hekuang@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1428658500-6483-1-git-send-email-hekuang@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Trying to analyze a big endian data file on a little endian system fails
with the error:
0xa9b40 [0x70]: failed to process type: 9
The problem is that header parsing is not done correctly because the
file attributes are not swapped. Make it so. With this patch, a sparc64
data file can be analyzed on x86_64.
Signed-off-by: David Ahern <david.ahern@oracle.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1428610546-178789-1-git-send-email-david.ahern@oracle.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When traversing /proc to synthesize the PERF_RECORD_FORK et al events we
were bailing out on errors without calling closedir(), fix it.
Reported-by: David Ahern <dsahern@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-vxtp593rfztgbi8noy0m967p@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Commit ca6c41c59b9 sets the ppid based on what is read from the
/proc/pid/status file when synthesizing fork events.
This is the correct thing to do for new processes, but not for threads
of a process.
Fix ppid for threads to be the main thread when synthesizing fork events
(i.e., assume the main thread spawned all sub-threads in a process).
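The rule described above fits in a few lines. This is a hedged sketch with a hypothetical helper name; it assumes, as on Linux, that the main thread's tid equals the process's tgid.

```c
#include <assert.h>
#include <sys/types.h>

/*
 * Pick the parent id to report in a synthesized fork event.
 * status_ppid is the value read from the status file of the process.
 */
pid_t synthesized_ppid(pid_t tgid, pid_t tid, pid_t status_ppid)
{
	assert(tgid > 0 && tid > 0);

	if (tid == tgid)
		return status_ppid;	/* main thread: use the real parent */

	return tgid;	/* other threads: treat the main thread as parent */
}
```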
Reported-by: Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com>
Signed-off-by: David Ahern <david.ahern@oracle.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Don Zickus <dzickus@redhat.com>
Link: http://lkml.kernel.org/r/1428598107-178999-1-git-send-email-david.ahern@oracle.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add the 'I' event modifier to complete the set of modifiers for the
perf_event_attr:exclude_* bits.
Any event specified with the 'I' modifier will have the
perf_event_attr:exclude_idle bit set.
$ perf record -e cycles:I -vv ls 2>&1 | grep exclude_idle
exclude_hv 0 exclude_idle 1
Also add automated tests.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: William Cohen <wcohen@redhat.com>
Link: http://lkml.kernel.org/r/1428441919-23099-2-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
report__warn_kptr_restrict() calls map__kmap(kernel_map) before checking
kernel_map against NULL.
This is dangerous, since map__kmap() will return an invalid, non-NULL
address.
It will trigger a warning message in map__kmap() once the patch "perf:
kmaps: enforce usage of kmaps to protect futher bugs." is applied.
This patch fixes it by adding the missing check.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1428490772-135393-1-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The following commit:
1a5941312414 perf: Add wakeup watermark control to the AUX area
enlarged perf_event_attr, but did not update the attr tests.
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Kaixu Xia <kaixu.xia@linaro.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <rric@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Markus T Metzger <markus.t.metzger@intel.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: http://lkml.kernel.org/n/20150407171715.GA22603@krava.redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Commit 9b118acae310f57baee770b5db402500d8695e50 ("perf probe: Fix to
handle aliased symbols in glibc") uses a fixed '%lx' format to
print a u64 argument, which causes a compile error on 32-bit ARM.
This patch replaces it with PRIx64.
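A short sketch of the portable form: PRIx64 from <inttypes.h> expands to the correct length modifier for a 64-bit value on each target, whereas '%lx' only matches where long is 64 bits wide. format_address() is a hypothetical helper, not a perf function.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* Format a 64-bit address as lowercase hex, independent of sizeof(long). */
int format_address(char *buf, size_t len, uint64_t addr)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%" PRIx64, addr);
}
```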
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1428459274-138470-1-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Currently there are 3 (that I found) different and incomplete
implementations of printing perf_event_attr.
This is quite silly. Merge the lot.
While this patch does not retain the exact form, all printing that I
found is debug output, and thus it should not be critical.
Also, I cannot find a single print_event_desc() caller.
Pre:
$ perf record -vv -e cycles -- sleep 1
------------------------------------------------------------
perf_event_attr:
type 0
size 104
config 0
sample_period 4000
sample_freq 4000
sample_type 0x107
read_format 0
disabled 1 inherit 1
pinned 0 exclusive 0
exclude_user 0 exclude_kernel 0
exclude_hv 0 exclude_idle 0
mmap 1 comm 1
mmap2 1 comm_exec 1
freq 1 inherit_stat 0
enable_on_exec 1 task 1
watermark 0 precise_ip 0
mmap_data 0 sample_id_all 1
exclude_host 0 exclude_guest 1
excl.callchain_kern 0 excl.callchain_user 0
wakeup_events 0
wakeup_watermark 0
bp_type 0
bp_addr 0
config1 0
bp_len 0
config2 0
branch_sample_type 0
sample_regs_user 0
sample_stack_user 0
sample_regs_intr 0
------------------------------------------------------------
$ perf evlist -vv
cycles: sample_freq=4000, size: 104, sample_type: IP|TID|TIME|PERIOD,
disabled: 1, inherit: 1, mmap: 1, mmap2: 1, comm: 1, comm_exec: 1,
freq: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1
Post:
$ ./perf record -vv -e cycles -- sleep 1
------------------------------------------------------------
perf_event_attr:
size 112
{ sample_period, sample_freq } 4000
sample_type IP|TID|TIME|PERIOD
disabled 1
inherit 1
mmap 1
comm 1
freq 1
enable_on_exec 1
task 1
sample_id_all 1
exclude_guest 1
mmap2 1
comm_exec 1
------------------------------------------------------------
$ ./perf evlist -vv
cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type:
IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq:
1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1,
mmap2: 1, comm_exec: 1
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20150407091150.644238729@infradead.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Teach perf-record about the new perf_event_attr::{use_clockid, clockid}
fields. Add a simple parameter to set the clock (if any) to be used for
the events to be recorded into the data file.
Since we store the entire perf_event_attr in the EVENT_DESC section we
also already store the used clockid in the data file.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yunlong Song <yunlong.song@huawei.com>
Link: http://lkml.kernel.org/r/20150407154851.GR23123@twins.programming.kicks-ass.net
[ Conditionally define CLOCK_BOOTTIME, at least rhel6 doesn't have it - dsahern
Ditto for CLOCK_MONOTONIC_RAW, sles11sp2 doesn't have it - yunlong.song ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
instead of the default value 10
Since sched->replay_repeat is set to 10 as default, the sched->run_avg,
sched->runavg_cpu_usage, and sched->runavg_parent_cpu_usage all use
10 to calculate their value.
However, replay_repeat can be changed to another value with the -r
option, so the calculation above should use replay_repeat instead of the
default value 10 to achieve more accurate results.
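The effect can be sketched as a running average weighted over the repeat count. The helper name update_run_avg() is hypothetical, and the weighting formula is an assumption inferred from the ravg column that perf sched replay prints; it is not copied from builtin-sched.c.

```c
/*
 * Fold a new sample into a running average weighted over `repeat` runs.
 * Assumed formula: avg' = (avg * (repeat - 1) + sample) / repeat.
 */
static double update_run_avg(double avg, double sample, unsigned long repeat)
{
	if (repeat == 0)
		repeat = 1;		/* guard against division by zero */
	if (avg == 0.0)
		return sample;		/* the first sample seeds the average */
	return (avg * (double)(repeat - 1) + sample) / (double)repeat;
}
```

With a hard-coded 10, running with -r 5 would still weight each new sample by 1/10; passing the actual replay_repeat as `repeat` keeps the weighting consistent with the number of runs.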
Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1427809596-29559-10-git-send-email-yunlong.song@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Allow perf.data to be used when it is not owned by the current user or root.
Example:
$ ls -al perf.data
-rw------- 1 Yunlong.Song Yunlong.Song 5321918 Mar 25 15:14 perf.data
$ sudo id
uid=0(root) gid=0(root) groups=0(root),64(pkcs11)
Before this patch:
$ sudo perf sched replay -f
run measurement overhead: 98 nsecs
sleep measurement overhead: 52909 nsecs
the run test took 1000015 nsecs
the sleep test took 1054253 nsecs
File perf.data not owned by current user or root (use -f to override)
As shown above, the -f option does not work at all.
After this patch:
$ sudo perf sched replay -f
run measurement overhead: 221 nsecs
sleep measurement overhead: 40514 nsecs
the run test took 1000003 nsecs
the sleep test took 1056098 nsecs
nr_run_events: 10
nr_sleep_events: 1562
nr_wakeup_events: 5
task 0 ( :1: 1), nr_events: 1
task 1 ( :2: 2), nr_events: 1
task 2 ( :3: 3), nr_events: 1
...
...
task 1549 ( :163132: 163132), nr_events: 1
task 1550 ( :163540: 163540), nr_events: 1
task 1551 ( <unknown>: 0), nr_events: 10
------------------------------------------------------------
#1 : 50.198, ravg: 50.20, cpu: 2335.18 / 2335.18
#2 : 219.099, ravg: 67.09, cpu: 2835.11 / 2385.17
#3 : 238.626, ravg: 84.24, cpu: 3278.26 / 2474.48
#4 : 200.364, ravg: 95.85, cpu: 2977.41 / 2524.77
#5 : 176.882, ravg: 103.96, cpu: 2801.35 / 2552.43
#6 : 191.093, ravg: 112.67, cpu: 2813.70 / 2578.56
#7 : 189.448, ravg: 120.35, cpu: 2809.21 / 2601.62
#8 : 200.637, ravg: 128.38, cpu: 2849.91 / 2626.45
#9 : 248.338, ravg: 140.37, cpu: 4380.61 / 2801.87
#10 : 511.139, ravg: 177.45, cpu: 3077.73 / 2829.45
As shown above, the -f option really works now.
Besides replay, the -f option also works for latency and map.
Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1427809596-29559-9-git-send-email-yunlong.song@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
maximum open files
The soft maximum number of open files for a calling process is 1024,
which is defined as INR_OPEN_CUR in include/uapi/linux/fs.h, and the
hard maximum number of open files for a calling process is 4096, which
is defined as INR_OPEN_MAX in include/uapi/linux/fs.h.
Both INR_OPEN_CUR and INR_OPEN_MAX are used to limit the value of
RLIMIT_NOFILE in include/asm-generic/resource.h.
And the soft maximum number finally decides the limitation of the
maximum files which are allowed to be opened.
That is to say, a process can use at most 1024 file descriptors for its
opened files, or an EMFILE error will happen.
This error can be fixed by increasing the soft maximum number, under the
constraint that the soft maximum number can not exceed the hard maximum
number, or both soft and hard maximum number should be increased
simultaneously with privilege.
For perf sched replay, it uses sys_perf_event_open to create the file
descriptor for each of the tasks in order to handle information of perf
events.
That is to say, each task needs a unique file descriptor. On x86_64,
there may be more than 1024 or even 4096 tasks corresponding to the
records in perf.data, so there are not enough file descriptors available.
As a result, an EMFILE error occurs and stops the replay process. To solve
this problem, we adaptively increase the soft and hard maximum number of
open files with a '-f' option.
Example:
Test environment: x86_64 with 160 cores
$ cat /proc/sys/kernel/pid_max
163840
$ cat /proc/sys/fs/file-max
6815744
$ ulimit -Sn
1024
$ ulimit -Hn
4096
Before this patch:
$ perf sched replay
...
task 1549 ( :163132: 163132), nr_events: 1
task 1550 ( :163540: 163540), nr_events: 1
task 1551 ( <unknown>: 0), nr_events: 10
Error: sys_perf_event_open() syscall returned with -1 (Too many open
files)
After this patch:
$ perf sched replay
...
task 1549 ( :163132: 163132), nr_events: 1
task 1550 ( :163540: 163540), nr_events: 1
task 1551 ( <unknown>: 0), nr_events: 10
Error: sys_perf_event_open() syscall returned with -1 (Too many open
files)
Have a try with -f option
$ perf sched replay -f
...
task 1549 ( :163132: 163132), nr_events: 1
task 1550 ( :163540: 163540), nr_events: 1
task 1551 ( <unknown>: 0), nr_events: 10
------------------------------------------------------------
#1 : 54.401, ravg: 54.40, cpu: 3285.21 / 3285.21
#2 : 199.548, ravg: 68.92, cpu: 4999.65 / 3456.66
#3 : 170.483, ravg: 79.07, cpu: 1349.94 / 3245.99
#4 : 192.034, ravg: 90.37, cpu: 1322.88 / 3053.67
#5 : 182.929, ravg: 99.62, cpu: 1406.51 / 2888.96
#6 : 152.974, ravg: 104.96, cpu: 1167.54 / 2716.82
#7 : 155.579, ravg: 110.02, cpu: 2992.53 / 2744.39
#8 : 130.557, ravg: 112.08, cpu: 1126.43 / 2582.59
#9 : 138.520, ravg: 114.72, cpu: 1253.22 / 2449.65
#10 : 134.328, ravg: 116.68, cpu: 1587.95 / 2363.48
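The adaptive increase of the limits described above can be sketched with getrlimit()/setrlimit(). This is an illustration under assumptions, not the code from builtin-sched.c; raise_nofile_limit() is a hypothetical helper.

```c
#include <sys/resource.h>

/*
 * Make sure at least `needed` file descriptors may be opened.
 * Returns 0 on success, -1 on failure (errno is set by the failing call).
 */
static int raise_nofile_limit(rlim_t needed)
{
	struct rlimit lim;

	if (getrlimit(RLIMIT_NOFILE, &lim) < 0)
		return -1;
	if (lim.rlim_cur >= needed)
		return 0;			/* soft limit is already large enough */

	lim.rlim_cur = needed;
	if (lim.rlim_max != RLIM_INFINITY && needed > lim.rlim_max)
		lim.rlim_max = needed;		/* raising the hard limit needs privilege */

	return setrlimit(RLIMIT_NOFILE, &lim);
}
```

Raising only the soft limit up to the hard limit works for any user; when the hard limit must grow as well, setrlimit() fails with EPERM unless the process is privileged.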
Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1427809596-29559-8-git-send-email-yunlong.song@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
fails for any task
wait_for_tasks() calls sem_wait for each task, e.g.
sem_wait(&task->work_done_sem).
The sem_wait can continue only when work_done_sem is greater than 0, or
it will be blocked.
For perf sched replay, one task may sem_post the work_done_sem of
another task, so that the work_done_sem of that task is processed in a
reasonable sequence, e.g. sem_post, sem_wait, sem_wait, sem_post...
This sequence simulates the sched process of the running tasks at the
time when perf sched record runs.
As a result, all the tasks are required and their threads must be
successfully created.
If any one of the tasks (task A) fails to create its thread, then
another task (task B), whose work_done_sem needs a sem_post from the
failed task A, will likely block in sem_wait.
This is a dead halt, since task B's thread_func cannot continue at
all.
To solve this problem, perf sched replay should exit once any task fails
to create its thread.
Example:
Test environment: x86_64 with 160 cores
Before this patch:
$ perf sched replay
...
Error: sys_perf_event_open() syscall returned with -1 (Too many open
files)
------------------------------------------------------------ <- dead halt
After this patch:
$ perf sched replay
...
task 1551 ( <unknown>: 0), nr_events: 10
Error: sys_perf_event_open() syscall returned with -1 (Too many open
files)
$
As shown above, perf sched replay finishes the process after printing an
error message and does not block itself.
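The blocking condition behind the dead halt can be illustrated with a small probe: a semaphore that was never posted stays at 0, so a sem_wait() on it cannot return. The helper would_block() is hypothetical and only demonstrates the semaphore behaviour, not the replay code itself.

```c
#include <errno.h>
#include <semaphore.h>

/*
 * Returns 1 if sem_wait() on `sem` would block right now, 0 if it would
 * succeed (in which case the count is consumed), and -1 on other errors.
 */
static int would_block(sem_t *sem)
{
	if (sem_trywait(sem) == 0)
		return 0;
	return errno == EAGAIN ? 1 : -1;
}
```

If the thread that is supposed to call sem_post() was never created, the probe keeps returning 1, which is the situation task B ends up in.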
Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1427809596-29559-7-git-send-email-yunlong.song@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
threads
The pr_err in self_open_counters() prints error message to stderr.
Unlike stdout, stderr uses memory buffer on the stack of each calling
process.
The pr_err in self_open_counters() works in a thread called thread_func
created in function create_tasks, which concurrently creates
sched->nr_tasks threads.
If the error happens and pr_err prints the error message in each of
these threads, the stack of the perf process (8192 Kbytes by
default) quickly runs out and a segmentation fault
follows.
To solve this problem, the pr_err with self_open_counters() should be moved
from the newly created threads to the main thread of the perf process.
Then pr_err works reliably, without the
segmentation fault.
Example:
Test environment: x86_64 with 160 cores
Before this patch:
$ perf sched replay
...
task 1549 ( :163132: 163132), nr_events: 1
task 1550 ( :163540: 163540), nr_events: 1
task 1551 ( <unknown>: 0), nr_events: 10
Segmentation fault
After this patch:
$ perf sched replay
...
task 1549 ( :163132: 163132), nr_events: 1
task 1550 ( :163540: 163540), nr_events: 1
task 1551 ( <unknown>: 0), nr_events: 10
...
As shown above, the result continues without any segmentation fault.
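A minimal sketch of the restructuring, assuming the counter is opened in the main thread and only the resulting descriptor is handed to the worker. All names here (task_params, open_then_spawn, the demo_open_* stand-ins) are hypothetical and chosen for illustration.

```c
#include <pthread.h>
#include <stdio.h>

struct task_params {
	int fd;		/* descriptor opened by the main thread */
};

/* Stand-ins for the real counter open; purely illustrative. */
static int demo_open_ok(void)   { return 3; }
static int demo_open_fail(void) { return -1; }

static void *thread_func(void *arg)
{
	struct task_params *p = arg;

	(void)p->fd;	/* a real worker would read the counter here */
	return NULL;
}

/* Open in the caller (main thread), report errors there, then spawn. */
static int open_then_spawn(pthread_t *thr, struct task_params *p,
			   int (*opener)(void))
{
	p->fd = opener();
	if (p->fd < 0) {
		fprintf(stderr, "Error: could not open counter\n");
		return -1;
	}
	return pthread_create(thr, NULL, thread_func, p) ? -1 : 0;
}
```

Because the error path runs before any worker exists, at most one message is printed, and the caller can stop creating further threads.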
Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1427809596-29559-6-git-send-email-yunlong.song@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
the different pid_max configurations
Although the memory of pid_to_task can be allocated via calloc according
to the value of /proc/sys/kernel/pid_max, it cannot handle the case when
pid_max is changed after 'perf sched record' has created its perf.data.
If the new pid_max configured in 'perf sched replay' is smaller than the
old pid_max configured in 'perf sched record', then it will cause the
assertion failure problem.
To solve this problem, we realloc the memory of pid_to_task stepwise
once the passed-in pid parameter in register_pid is larger than the
current pid_max.
Example:
Test environment: x86_64 with 160 cores
$ cat /proc/sys/kernel/pid_max
163840
$ perf sched record ls
$ echo 5000 > /proc/sys/kernel/pid_max
$ cat /proc/sys/kernel/pid_max
5000
Before this patch:
$ perf sched replay
run measurement overhead: 221 nsecs
sleep measurement overhead: 55356 nsecs
the run test took 1000011 nsecs
the sleep test took 1060940 nsecs
perf: builtin-sched.c:337: register_pid: Assertion `!(pid >= (unsigned
long)pid_max)' failed.
Aborted
After this patch:
$ perf sched replay
run measurement overhead: 221 nsecs
sleep measurement overhead: 55611 nsecs
the run test took 1000026 nsecs
the sleep test took 1060486 nsecs
nr_run_events: 10
nr_sleep_events: 1562
nr_wakeup_events: 5
task 0 ( :1: 1), nr_events: 1
task 1 ( :2: 2), nr_events: 1
task 2 ( :3: 3), nr_events: 1
task 3 ( :5: 5), nr_events: 1
...
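The stepwise growth of the table can be sketched as follows. The struct and function names are hypothetical; the sketch only assumes that the table is an array of task pointers indexed by pid and that new slots must start out as NULL.

```c
#include <stdlib.h>

struct task_desc;	/* opaque in this sketch */

struct pid_table {
	struct task_desc **slots;
	unsigned long size;	/* number of valid slots */
};

/* Make slots[pid] addressable, NULL-filling any newly added entries. */
static int pid_table_reserve(struct pid_table *t, unsigned long pid)
{
	struct task_desc **grown;

	if (pid < t->size)
		return 0;

	grown = realloc(t->slots, (pid + 1) * sizeof(*grown));
	if (!grown)
		return -1;	/* the old table is still valid */

	while (t->size <= pid)
		grown[t->size++] = NULL;
	t->slots = grown;
	return 0;
}
```

A pid recorded under a larger pid_max, such as 163540, then simply extends a table that was sized for pid_max 5000 instead of tripping an assertion.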
Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1427809596-29559-5-git-send-email-yunlong.song@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
the unexpected change of pid_max
The current memory allocation of struct task_desc *pid_to_task[MAX_PID]
is in a permanent and preset way, and it has two problems:
Problem 1: If the pid_max, which is the max number of pids in the
system, is much smaller than MAX_PID (1024*1000), then it causes a waste
of stack memory. This may happen in the case where the number of cpu
cores is much smaller than 1000.
Problem 2: If the pid_max is changed from the default value to a value
larger than MAX_PID, then it causes an assertion failure. The
maximum value of pid_max can be set to pid_max_max (see pidmap_init
defined in kernel/pid.c), which equals PID_MAX_LIMIT. On x86_64,
PID_MAX_LIMIT is 4*1024*1024 (defined in include/linux/threads.h). This
value is much larger than MAX_PID, and pid_to_task would then take up
32768 Kbytes (4*1024*1024*8/1024), which is much
larger than the default 8192 Kbytes stack size of the calling
process.
Due to these two problems, we use calloc to allocate the memory of
pid_to_task dynamically.
Example:
Test environment: x86_64 with 160 cores
$ cat /proc/sys/kernel/pid_max
163840
$ echo 1025000 > /proc/sys/kernel/pid_max
$ cat /proc/sys/kernel/pid_max
1025000
Run some applications until the pid of some process is greater than
the value of MAX_PID (1024*1000).
Before this patch:
$ perf sched replay
run measurement overhead: 221 nsecs
sleep measurement overhead: 55480 nsecs
the run test took 1000008 nsecs
the sleep test took 1063151 nsecs
perf: builtin-sched.c:330: register_pid: Assertion `!(pid >= 1024000)'
failed.
Aborted
After this patch:
$ perf sched replay
run measurement overhead: 221 nsecs
sleep measurement overhead: 55435 nsecs
the run test took 1000004 nsecs
the sleep test took 1059312 nsecs
nr_run_events: 10
nr_sleep_events: 1562
nr_wakeup_events: 5
task 0 ( :1: 1), nr_events: 1
task 1 ( :2: 2), nr_events: 1
task 2 ( :3: 3), nr_events: 1
task 3 ( :5: 5), nr_events: 1
...
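A sketch of sizing the table from the running system instead of a compile-time constant. It reads /proc/sys/kernel/pid_max directly and falls back to the old constant; the helper names and the fallback behaviour are assumptions made for illustration, not the perf implementation.

```c
#include <stdio.h>
#include <stdlib.h>

#define FALLBACK_MAX_PID (1024L * 1000L)	/* the old fixed MAX_PID */

/* Read the current pid_max, or fall back to the old constant. */
static long read_pid_max(void)
{
	long value = FALLBACK_MAX_PID;
	FILE *f = fopen("/proc/sys/kernel/pid_max", "r");

	if (f) {
		if (fscanf(f, "%ld", &value) != 1 || value <= 0)
			value = FALLBACK_MAX_PID;
		fclose(f);
	}
	return value;
}

/* Allocate a zeroed pointer table on the heap with one slot per pid. */
static void **alloc_pid_table(long *nr_slots)
{
	*nr_slots = read_pid_max();
	return calloc((size_t)*nr_slots, sizeof(void *));
}
```

The table lives on the heap, so neither a small pid_max wastes stack space nor a large one exceeds the stack limit.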
Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1427809596-29559-4-git-send-email-yunlong.song@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
MAX_PID is currently only 65536, which causes an assertion failure
when there are more than 64 CPU cores on x86_64.
This is because the pid_max value in x86_64 is at least
PIDS_PER_CPU_DEFAULT * num_possible_cpus() (see function pidmap_init
defined in kernel/pid.c), where PIDS_PER_CPU_DEFAULT is 1024 (defined in
include/linux/threads.h).
Thus MAX_PID = 65536 corresponds to only
65536/1024=64 CPU cores. This is clearly not enough for x86_64, and
causes an assertion failure due to BUG_ON(pid >= MAX_PID) in the
code.
We increase MAX_PID value from 65536 to 1024*1000, which can be used in
x86_64 with 1000 cores.
This number is finally decided according to the limitation of stack size
of calling process.
'ulimit -a' shows that the stack size of any process is 8192
Kbytes, which is defined in include/uapi/linux/resource.h (#define
_STK_LIM (8*1024*1024)).
Thus we choose a large enough value for MAX_PID that still satisfies
the stack size limit, i.e., the perf process takes up
a memory space just smaller than 8192 Kbytes.
We have calculated and tested that 1024*1000 is OK for MAX_PID.
This means perf sched replay can now be used with at most 1000 cores in
x86_64 without any assertion failure problem.
Example:
Test environment: x86_64 with 160 cores
$ cat /proc/sys/kernel/pid_max
163840
Before this patch:
$ perf sched replay
run measurement overhead: 240 nsecs
sleep measurement overhead: 55379 nsecs
the run test took 1000004 nsecs
the sleep test took 1059424 nsecs
perf: builtin-sched.c:330: register_pid: Assertion `!(pid >= 65536)'
failed.
Aborted
After this patch:
$ perf sched replay
run measurement overhead: 221 nsecs
sleep measurement overhead: 55397 nsecs
the run test took 999920 nsecs
the sleep test took 1053313 nsecs
nr_run_events: 10
nr_sleep_events: 1562
nr_wakeup_events: 5
task 0 ( :1: 1), nr_events: 1
task 1 ( :2: 2), nr_events: 1
task 2 ( :3: 3), nr_events: 1
task 3 ( :5: 5), nr_events: 1
...
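The arithmetic behind these numbers can be checked with a few helpers. The sketch assumes 8-byte pointers (x86_64) and takes PIDS_PER_CPU_DEFAULT and _STK_LIM from the values quoted in this message; the helper names are hypothetical.

```c
#define PIDS_PER_CPU_DEFAULT	1024UL			/* include/linux/threads.h */
#define STACK_LIMIT_BYTES	(8UL * 1024 * 1024)	/* _STK_LIM */
#define POINTER_BYTES		8UL			/* assumption: x86_64 */

/* Number of CPU cores whose default pid range fits below max_pid. */
static unsigned long cores_covered(unsigned long max_pid)
{
	return max_pid / PIDS_PER_CPU_DEFAULT;
}

/* Size in Kbytes of a pointer table with max_pid entries. */
static unsigned long table_kbytes(unsigned long max_pid)
{
	return max_pid * POINTER_BYTES / 1024;
}

/* Whether such a table alone stays below the default stack limit. */
static int fits_in_stack(unsigned long max_pid)
{
	return max_pid * POINTER_BYTES < STACK_LIMIT_BYTES;
}
```

For MAX_PID = 1024*1000 the table takes 8000 Kbytes, which leaves 192 Kbytes of the 8192 Kbytes stack for everything else.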
Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1427809596-29559-3-git-send-email-yunlong.song@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
correct meaning
There is no struct task_task at all, so this is a typo from older
commits. Fix it to what it should be in order to avoid unnecessary
misunderstanding.
Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1427809596-29559-2-git-send-email-yunlong.song@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Currently perf kmem does not respect the -i option.
Initialize file.path properly after the options are parsed.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/1428298576-9785-2-git-send-email-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Currently it ignores operator priority and just sets processed args as a
right operand. But it could result in priority inversion in case that
the right operand is also an operator arg and its priority is lower.
For example, following print format is from new kmem events.
"page=%p", REC->pfn != -1UL ? (((struct page *)(0xffffea0000000000UL)) + (REC->pfn)) : ((void *)0)
But this was treated as below:
REC->pfn != ((null - 1UL) ? ((struct page *)0xffffea0000000000UL + REC->pfn) : (void *) 0)
In this case, the right arg was the '?' operator, which has lower priority.
But the parser just set the whole arg as the right operand, making the
output confusing: page was always 0 or 1 since that's the result of a
logical operation.
With this patch, it is handled properly, as follows:
((REC->pfn != (null - 1UL)) ? ((struct page *)0xffffea0000000000UL + REC->pfn) : (void *) 0)
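The difference between the two parses can be reproduced in plain C. The sketch uses 64-bit integers in place of struct page pointers, so the addition is not scaled by the structure size; the function names are hypothetical.

```c
#include <stdint.h>

#define VMEMMAP_BASE	UINT64_C(0xffffea0000000000)
#define NO_PFN		UINT64_MAX	/* stands in for -1UL */

/* Intended grouping: (pfn != -1UL) ? base + pfn : 0 */
static uint64_t eval_intended(uint64_t pfn)
{
	return (pfn != NO_PFN) ? VMEMMAP_BASE + pfn : 0;
}

/* Inverted grouping: pfn != (-1UL ? base + pfn : 0), a boolean result */
static uint64_t eval_inverted(uint64_t pfn)
{
	return pfn != (NO_PFN ? VMEMMAP_BASE + pfn : 0);
}
```

The inverted form can only ever yield 0 or 1, which matches the confusing page values seen before the fix.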
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/1428298576-9785-10-git-send-email-namhyung@kernel.org
[ Replaced 'swap' with 'rotate' in a comment as requested by Steve and agreed by Namhyung ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This patch adds checks in places where map__kmap is used to get kmaps
from struct kmap.
Error messages are added at map__kmap to warn about invalid access to kmap
(for the case of !map->dso->kernel, kmap(map) does not exist at all).
Also, introduce map__kmaps() to warn about uninitialized kmaps.
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: pi3orama@163.com
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Link: http://lkml.kernel.org/r/1428394966-131044-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
perf_evlist__mmap_consume() uses perf_mmap__empty() to judge whether
perf_mmap is empty and can be released. But the result is inverted so
fix it.
Signed-off-by: He Kuang <hekuang@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1428399071-7141-1-git-send-email-hekuang@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Conflicts:
arch/x86/kernel/entry_64.S
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
This is my sigreturn test, added mostly unchanged from its old
home. It exercises the sigreturn(2) syscall, specifically
focusing on its interactions with various IRET corner cases.
It tests for correct behavior in several areas that were
historically dangerously buggy. For example, it exercises espfix
on kernels of both bitnesses under various conditions, and it
contains testcases for several now-fixed bugs in IRET error
handling.
If you run it on older kernels without the fixes, your system will
crash. It probably won't eat your data in the process.
There is no released kernel on which the sigreturn_64 test will
pass, but it passes on tip:x86/asm.
I plan to switch to lib.mk for Linux 4.2.
I'm not using the ksft_ helpers at all yet. I can do that later.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Shuah Khan <shuahkh@osg.samsung.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Shuah Khan <shuah.kh@samsung.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/89d10b76b92c7202d8123654dc8d36701c017b3d.1428386971.git.luto@kernel.org
[ Fixed empty format string GCC build warning in trivial_32bit_program.c ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
We want those fixes (iio primarily) into the -next branch to help with
merge and testing issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
usleep is only provided on Red Hat distros, so running ftracetest
on other distros resulted in failures due to the missing usleep.
The reason for using [u]sleep in the test was to generate (scheduler)
events. This can be done in various ways, like this:
yield() { ping localhost -c 1 || sleep .001 || usleep 1 || sleep 1; }
For more information on the history of this patch, please refer to:
Link: http://lkml.kernel.org/r/1427329943-16896-1-git-send-email-namhyung@kernel.org
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Reported-by: Dave Jones <davej@codemonkey.org.uk>
Reported-by: Luis Henriques <luis.henriques@canonical.com>
Suggested-by: Pádraig Brady <P@draigBrady.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
|
|
Conflicts:
drivers/net/usb/asix_common.c
drivers/net/usb/sr9800.c
drivers/net/usb/usbnet.c
include/linux/usb/usbnet.h
net/ipv4/tcp_ipv4.c
net/ipv6/tcp_ipv6.c
The TCP conflicts were overlapping changes. In 'net' we added a
READ_ONCE() to the socket cached RX route read, whilst in 'net-next'
Eric Dumazet touched the surrounding code dealing with how mini
sockets are handled.
With USB, it's a case of the same bug fix first going into net-next
and then I cherry picked it back into net.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Use the CC variable instead of hard coding gcc. Also clean up the compiler
options by creating a CFLAGS variable.
Signed-off-by: Tyler Baker <tyler.baker@linaro.org>
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
|
|
Use the CC variable instead of hard coding gcc.
Signed-off-by: Tyler Baker <tyler.baker@linaro.org>
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
|