Age | Commit message (Collapse) | Author | Files | Lines |
|
Cc: stable@kernel.org
Signed-off-by: Robert Richter <robert.richter@amd.com>
|
|
Multiple virtual counters share one physical counter. The reservation
of virtual counters fails due to duplicate allocation of the same
counter. The counters are already reserved. Thus, virtual counter
reservation may removed at all. This also makes the code easier.
Cc: stable@kernel.org
Signed-off-by: Robert Richter <robert.richter@amd.com>
|
|
Add Xeon 7500 series support to oprofile.
Straight forward: it's the same as Core i7, so just detect
the model number. No user space changes needed.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
|
|
With multiplexing enabled oprofile crashs when profiling more than 28
events. This patch fixes this.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
|
|
This patch adds a check for the availability of a counter. A virtual
counter is used only if its physical counter is not reserved.
Signed-off-by: Robert Richter <robert.richter@amd.com>
|
|
This patch implements a common x86 function to convert virtual counter
numbers to physical.
Signed-off-by: Robert Richter <robert.richter@amd.com>
|
|
This patch moves the multiplexing switch counter from x86 code to
common oprofile statistic variables. Now the value will be available
and usable for all architectures. The initialization and
incrementation also moved to common code.
Signed-off-by: Robert Richter <robert.richter@amd.com>
|
|
To setup a counter for all cpus, its structure is cloned from cpu
0. This patch implements mux_clone() to do this part for multiplexing
data.
Signed-off-by: Robert Richter <robert.richter@amd.com>
|
|
This patch checks if the model supports multiplexing. Only then
multiplexing will be enabled. The code is added to the common x86
initialization.
Signed-off-by: Robert Richter <robert.richter@amd.com>
|
|
The check is used to prevent running multiplexing code for models not
supporting multiplexing. Before, the code was running but without
effect.
Signed-off-by: Robert Richter <robert.richter@amd.com>
|
|
Models that do not yet support counter multiplexing have to setup
num_virt_counters. This patch implements the setup from num_counters
if num_virt_counters is not set. Thus, num_virt_counters must be setup
only for multiplexing support.
Signed-off-by: Robert Richter <robert.richter@amd.com>
|
|
This patch removes the const qualifier from struct
op_x86_model_spec to make model parameters changable.
Signed-off-by: Robert Richter <robert.richter@amd.com>
|
|
This patch moves some code in nmi_int.c to get a single separate
multiplexing code section.
Signed-off-by: Robert Richter <robert.richter@amd.com>
|
|
This patch moves some code in nmi_int.c to get a single separate
multiplexing code section.
Signed-off-by: Robert Richter <robert.richter@amd.com>
|
|
This patch moves some code in nmi_int.c to get a single separate
multiplexing code section.
Signed-off-by: Robert Richter <robert.richter@amd.com>
|
|
This patch implements nmi_setup_mux() and nmi_shutdown_mux() functions
to setup/shutdown multiplexing. Multiplexing code in nmi_int.c is now
much more separated.
Signed-off-by: Robert Richter <robert.richter@amd.com>
|
|
This new function translates physical to virtual counter numbers.
Signed-off-by: Robert Richter <robert.richter@amd.com>
|
|
Variable switch_index must be initialized for each cpu. This patch
fixes the initialization by moving it to the per-cpu init function
nmi_cpu_setup().
Signed-off-by: Robert Richter <robert.richter@amd.com>
|
|
__get_cpu_var() calls smp_processor_id(). When the cpu id is already
known, instead use per_cpu() to avoid generating the id again.
Signed-off-by: Robert Richter <robert.richter@amd.com>
|
|
The number of hardware counters is limited. The multiplexing feature
enables OProfile to gather more events than counters are provided by
the hardware. This is realized by switching between events at an user
specified time interval.
A new file (/dev/oprofile/time_slice) is added for the user to specify
the timer interval in ms. If the number of events to profile is higher
than the number of hardware counters available, the patch will
schedule a work queue that switches the event counter and re-writes
the different sets of values into it. The switching mechanism needs to
be implemented for each architecture to support multiplexing. This
patch only implements AMD CPU support, but multiplexing can be easily
extended for other models and architectures.
There are follow-on patches that rework parts of this patch.
Signed-off-by: Jason Yeh <jason.yeh@amd.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
|
|
This patch fixes whitespace changes of code that will be touched in
follow-on patches.
Signed-off-by: Robert Richter <robert.richter@amd.com>
|
|
This patch removes the function nmi_save_registers(). Per-cpu code is
now executed only in the function nmi_cpu_setup(). Also, it renames
the per-cpu function nmi_restore_registers() to
nmi_cpu_restore_registers().
Signed-off-by: Robert Richter <robert.richter@amd.com>
|
|
Conflicts:
drivers/oprofile/oprofile_stats.c
drivers/usb/otg/Kconfig
drivers/usb/otg/Makefile
Signed-off-by: Robert Richter <robert.richter@amd.com>
|
|
The short name of the achitecture is 'arch_perfmon'. This patch
changes the kernel parameter to use this name.
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Commit:
e419294 x86/oprofile: moving arch_perfmon counter setup to op_x86_model_spec.init
introduced a bug in the initialization of core_i7 leading to the
incorrect model setup to &op_ppro_spec. This patch fixes this.
Signed-off-by: Robert Richter <robert.richter@amd.com>
|
|
Conflicts:
arch/x86/oprofile/op_model_ppro.c
Signed-off-by: Robert Richter <robert.richter@amd.com>
|
|
The patch removes some local variables in these functions.
Signed-off-by: Robert Richter <robert.richter@amd.com>
|
|
This patch removes struct op_saved_msr and replaces it by an u64
variable. This makes code easier and it is possible to use 64 bit MSR
functions.
Signed-off-by: Robert Richter <robert.richter@amd.com>
|
|
This patch introduces op_x86_get_ctrl() to calculate the value of the
performance control register. This is generic code usable for all
models. The event and reserved masks are model specific and stored in
struct op_x86_model_spec. 64 bit MSR functions are used now. The patch
removes many hard to read macros used for ctrl calculation.
The function op_x86_get_ctrl() is common code and the first step to
further merge performance counter implementations for x86 models.
Signed-off-by: Robert Richter <robert.richter@amd.com>
|
|
In follow-on patches the setup_ctrs() functions will need data that
describes the model. This patch extends the function argument list to
pass a pointer of the model to these function.
Signed-off-by: Robert Richter <robert.richter@amd.com>
|
|
Signed-off-by: Robert Richter <robert.richter@amd.com>
|
|
Conflicts:
arch/x86/kernel/irqinit.c
arch/x86/kernel/irqinit_64.c
arch/x86/kernel/traps.c
arch/x86/mm/fault.c
include/linux/sched.h
kernel/exit.c
|
|
The function arch_perfmon_init() in nmi_int.c is model specific. This
patch moves it to op_model_ppro.c by using the init function pointer
in struct op_x86_model_spec.
Cc: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
|
|
This patch removes module_param oprofile.force_arch_perfmon and
introduces oprofile.cpu_type=archperfmon instead. This new parameter
can be reused for other models and architectures.
Currently only archperfmon is supported.
Cc: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
|
|
The registers are about the same as other Family 6 CPUs
so we only need to add detection.
I'm not completely happy with calling Nehalem Core i7 because
there will be undoubtedly other Nehalem based CPUs
in the future with different marketing names, but it's
the best we got for now.
Requires updated oprofile userland for the new event files.
If you don't want to update right now you can also use
oprofile.force_arch_perfmon=1 (added in the next patch) with 0.9.4
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
|
|
There are no new P4s and the oprofile code knows about all existing
ones, so we don't really need the p4force option anymore.
Remove it.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
|
|
This re-adds the force_arch_perfmon option that was in the original
arch perfmon patchkit. Originally this was rejected in favour
of a generalized perfmon=name option, but it turned out implementing
the later in a reliable way is hard (and it would have been easy
to crash the kernel if a user gets it wrong)
But now Atom and Core i7 support being readded a user would
need to update their oprofile userland to beyond 0.9.4 to use oprofile again
on Atom or Core i7.
To avoid this problem readd the force_arch_perfmon option.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
|
|
With oprofile as a module, and unloaded by profiling script,
both oprofile and kerneltop work fine.. unless you leave kerneltop
running when you start profiling, then you may see badness.
Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Alan Jenkins wrote:
> This is on an EeePC 701, /proc/cpuinfo as attached.
>
> Is this expected? Will the next release work?
>
> Thanks, Alan
>
> # opcontrol --setup --no-vmlinux
> cpu_type 'unset' is not valid
> you should upgrade oprofile or force the use of timer mode
>
> # opcontrol -v
> opcontrol: oprofile 0.9.4 compiled on Nov 29 2008 22:44:10
>
> # cat /dev/oprofile/cpu_type
> i386/p6
> # uname -r
> 2.6.28-rc6eeepc
Hi Alan,
Looking at the kernel driver code for oprofile it can return the "i386/p6" for
the cpu_type. However, looking at the user-space oprofile code there isn't the
matching entry in libop/op_cpu_type.c or the events/unit_mask files in
events/i386 directory.
The Intel AP-485 says this is a "Intel Pentium M processor model D". Seems like
the oprofile kernel driver should be identifying the processor as "i386/p6_mobile"
The driver identification code doesn't look quite right in nmi_init.c
http://git.kernel.org/?p=linux/kernel/git/sfr/linux-next.git;a=blob;f=arch/x86/oprofile/nmi_int.c;h=022cd41ea9b4106e5884277096e80e9088a7c7a9;hb=HEAD
has:
409 case 10 ... 13:
410 *cpu_type = "i386/p6";
411 break;
Referring to the Intel AP-485:
case 10 and 11 should produce "i386/piii"
case 13 should produce "i386/p6_mobile"
I didn't see anything for case 12.
Something like the attached patch. I don't have a celeron machine to verify that
changes in this area of the kernel fix thing.
-Will
Signed-off-by: William Cohen <wcohen@redhat.com>
Tested-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
Acked-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
|
|
Conflicts:
arch/x86/oprofile/op_model_ppro.c
|
|
No functional changes. The intension is to remove static function
declarations.
Signed-off-by: Robert Richter <robert.richter@amd.com>
|
|
Newer Intel CPUs (Core1+) have support for architectural
events described in CPUID 0xA. See the IA32 SDM Vol3b.18 for details.
The advantage of this is that it can be done without knowing about
the specific CPU, because the CPU describes by itself what
performance events are supported. This is only a fallback
because only a limited set of 6 events are supported.
This allows to do profiling on Nehalem and on Atom systems
(later not tested)
This patch implements support for that in oprofile's Intel
Family 6 profiling module. It also has the advantage of supporting
an arbitary number of events now as reported by the CPU.
Also allow arbitary counter widths >32bit while we're at it.
Requires a patched oprofile userland to support the new
architecture.
v2: update for latest oprofile tree
remove force_arch_perfmon
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
|
|
This essentially reverts Linus' earlier 4b9f12a3779c548b68bc9af7d94030868ad3aa1b
commit. Nehalem is not core_2, so it shouldn't be reported as such.
However with the earlier arch perfmon patch it will fall back to
arch perfmon mode now, so there is no need to fake it as core_2.
The only drawback is that Linus will need to patch the arch perfmon
support into his oprofile binary now, but I think he can do that.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rric/oprofile into oprofile
Conflicts:
arch/x86/oprofile/nmi_int.c
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Conflicts:
arch/x86/oprofile/nmi_int.c
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Conflicts:
arch/x86/oprofile/nmi_int.c
|
|
Reverting commit 1a960b402a51d80abf54e3f8e4972374ffe5f22d for the main
branch. Multiplexing will be tracked on a separate feature branch.
Conflicts:
arch/x86/oprofile/nmi_int.c
|
|
Add __raw access before setting per cpu variable switch_index, to avoid
the following BUG:
[ 449.166827] BUG: using smp_processor_id() in preemptible [00000000] code: modprobe/6998
[ 449.166848] caller is op_nmi_init+0xf0/0x2b0 [oprofile]
[ 449.166855] Pid: 6998, comm: modprobe Not tainted 2.6.27-rc5-mm1 #29
[ 449.166860] Call Trace:
[ 449.166872] [<ffffffff80362d67>] debug_smp_processor_id+0xd7/0xe0
[ 449.166887] [<ffffffffa00181c0>] op_nmi_init+0xf0/0x2b0 [oprofile]
[ 449.166902] [<ffffffffa0018000>] oprofile_init+0x0/0x60 [oprofile]
[ 449.166915] [<ffffffffa00180a9>] oprofile_arch_init+0x9/0x30 [oprofile]
[ 449.166928] [<ffffffffa001801e>] oprofile_init+0x1e/0x60 [oprofile]
[ 449.166937] [<ffffffff8020903b>] _stext+0x3b/0x160
[ 449.166946] [<ffffffff80477985>] __mutex_unlock_slowpath+0xe5/0x190
[ 449.166955] [<ffffffff80262c1a>] trace_hardirqs_on_caller+0xca/0x140
[ 449.166965] [<ffffffff8026f7fc>] sys_init_module+0xdc/0x210
[ 449.166972] [<ffffffff8020b7cb>] system_call_fastpath+0x16/0x1b
Signed-off-by: Andrea Righi <righi.andrea@gmail.com>
Acked-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
nmi_shutdown() calls unregister_die_notifier() from an atomic context
after setting preempt_disable() via get_cpu_var():
[ 1049.404154] BUG: scheduling while atomic: oprofiled/7796/0x00000002
[ 1049.404171] INFO: lockdep is turned off.
[ 1049.404176] Modules linked in: oprofile af_packet rfcomm l2cap kvm_intel kvm i915 drm acpi_cpufreq cpufreq_userspace cpufreq_conservative cpufreq_ondemand cpufreq_powersave freq_table container sbs sbshc dm_mod arc4 ecb cryptomgr aead snd_hda_intel crypto_blkcipher snd_pcm_oss crypto_algapi snd_pcm iwlagn iwlcore snd_timer iTCO_wdt led_class btusb iTCO_vendor_support snd psmouse bluetooth mac80211 soundcore cfg80211 snd_page_alloc intel_agp video output button battery ac dcdbas evdev ext3 jbd mbcache sg sd_mod piix ata_piix libata scsi_mod dock tg3 libphy ehci_hcd uhci_hcd usbcore thermal processor fan fuse
[ 1049.404362] Pid: 7796, comm: oprofiled Not tainted 2.6.27-rc5-mm1 #30
[ 1049.404368] Call Trace:
[ 1049.404384] [<ffffffff804769fd>] thread_return+0x4a0/0x7d3
[ 1049.404396] [<ffffffff8026ad92>] generic_exec_single+0x52/0xe0
[ 1049.404405] [<ffffffff8026ae1a>] generic_exec_single+0xda/0xe0
[ 1049.404414] [<ffffffff8026aee3>] smp_call_function_single+0x73/0x150
[ 1049.404423] [<ffffffff804770c5>] schedule_timeout+0x95/0xd0
[ 1049.404430] [<ffffffff80476083>] wait_for_common+0x43/0x180
[ 1049.404438] [<ffffffff80476154>] wait_for_common+0x114/0x180
[ 1049.404448] [<ffffffff80236980>] default_wake_function+0x0/0x10
[ 1049.404457] [<ffffffff8024f810>] synchronize_rcu+0x30/0x40
[ 1049.404463] [<ffffffff8024f890>] wakeme_after_rcu+0x0/0x10
[ 1049.404472] [<ffffffff80479ca0>] _spin_unlock_irqrestore+0x40/0x80
[ 1049.404482] [<ffffffff80256def>] atomic_notifier_chain_unregister+0x3f/0x60
[ 1049.404501] [<ffffffffa03d8801>] nmi_shutdown+0x51/0x90 [oprofile]
[ 1049.404517] [<ffffffffa03d6134>] oprofile_shutdown+0x34/0x70 [oprofile]
[ 1049.404532] [<ffffffffa03d721e>] event_buffer_release+0xe/0x40 [oprofile]
[ 1049.404543] [<ffffffff802bdcdd>] __fput+0xcd/0x240
[ 1049.404551] [<ffffffff802baa74>] filp_close+0x54/0x90
[ 1049.404560] [<ffffffff8023e1d1>] put_files_struct+0xb1/0xd0
[ 1049.404568] [<ffffffff8023f82f>] do_exit+0x18f/0x930
[ 1049.404576] [<ffffffff8020be03>] restore_args+0x0/0x30
[ 1049.404584] [<ffffffff80240006>] do_group_exit+0x36/0xa0
[ 1049.404592] [<ffffffff8020b7cb>] system_call_fastpath+0x16/0x1b
This can be easily triggered with 'opcontrol --shutdown'.
Simply move get_cpu_var() above unregister_die_notifier().
Signed-off-by: Andrea Righi <righi.andrea@gmail.com>
Acked-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Vegard Nossum reported oprofile + hibernation problems:
> Now some warnings:
>
> ------------[ cut here ]------------
> WARNING: at /uio/arkimedes/s29/vegardno/git-working/linux-2.6/kernel/smp.c:328 s
> mp_call_function_mask+0x194/0x1a0()
The usual problem: the suspend function when interrupts are
already disabled calls smp_call_function which is not allowed with
interrupt off. But at this point all the other CPUs should be already
down anyways, so it should be enough to just drop that.
This patch should fix that problem at least by fixing cpu hotplug&
suspend support.
[ mingo@elte.hu: fixed 5 coding style errors. ]
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Vegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|