summaryrefslogtreecommitdiffstats
path: root/arch/x86/events/perf_event.h
AgeCommit message (Collapse)AuthorFilesLines
2016-12-11perf/x86: Fix exclusion of BTS and LBR for GoldmontAndi Kleen1-1/+1
An earlier patch allowed enabling PT and LBR at the same time on Goldmont. However it also allowed enabling BTS and LBR at the same time, which is still not supported. Fix this by bypassing the check only for PT. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: alexander.shishkin@intel.com Cc: kan.liang@intel.com Cc: <stable@vger.kernel.org> Fixes: ccbebba4c6bf ("perf/x86/intel/pt: Bypass PT vs. LBR exclusivity if the core supports it") Link: http://lkml.kernel.org/r/20161209001417.4713-1-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-22perf/x86/intel: Cure bogus unwind from PEBS entriesPeter Zijlstra1-1/+1
Vince Weaver reported that perf_fuzzer + KASAN detects that PEBS event unwinds sometimes do 'weird' things. In particular, we seemed to be ending up unwinding from random places on the NMI stack. While it was somewhat expected that the event record BP,SP would not match the interrupt BP,SP in that the interrupt is strictly later than the record event, it was overlooked that it could be on an already overwritten stack. Therefore, don't copy the recorded BP,SP over the interrupted BP,SP when we need stack unwinds. Note that its still possible the unwind doesn't full match the actual event, as its entirely possible to have done an (I)RET between record and interrupt, but on average it should still point in the general direction of where the event came from. Also, it's the best we can do, considering. The particular scenario that triggered the bogus NMI stack unwind was a PEBS event with very short period, upon enabling the event at the tail of the PMI handler (FREEZE_ON_PMI is not used), it instantly triggers a record (while still on the NMI stack) which in turn triggers the next PMI. This then causes back-to-back NMIs and we'll try and unwind the stack-frame from the last NMI, which obviously is now overwritten by our own. Analyzed-by: Josh Poimboeuf <jpoimboe@redhat.com> Reported-by: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: davej@codemonkey.org.uk <davej@codemonkey.org.uk> Cc: dvyukov@google.com <dvyukov@google.com> Cc: stable@vger.kernel.org Fixes: ca037701a025 ("perf, x86: Add PEBS infrastructure") Link: http://lkml.kernel.org/r/20161117171731.GV3157@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-10perf/x86/intel: Clean up LBR state trackingPeter Zijlstra1-1/+0
The lbr_context logic confused me; it appears to me to try and do the same thing the pmu::sched_task() callback does now, but limited to per-task events. So rip it out. Afaict this should also improve performance, because I think the current code can end up doing lbr_reset() twice, once from the pmu::add() and then again from pmu::sched_task(), and MSR writes (all 3*16 of them) are expensive!! While thinking through the cases that need the reset it occured to me the first install of an event in an active context needs to reset the LBR (who knows what crap is in there), but detecting this case is somewhat hard. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-10perf/x86: Ensure perf_sched_cb_{inc,dec}() is only called from pmu::{add,del}()Peter Zijlstra1-2/+8
Currently perf_sched_cb_{inc,dec}() are called from pmu::{start,stop}(), which has the problem that this can happen from NMI context, this is making it hard to optimize perf_pmu_sched_task(). Furthermore, we really only need this accounting on pmu::{add,del}(), so doing it from pmu::{start,stop}() is doing more work than we really need. Introduce x86_pmu::{add,del}() and wire up the LBR and PEBS. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-10perf/x86/intel: Rework the large PEBS setup codePeter Zijlstra1-0/+2
In order to allow optimizing perf_pmu_sched_task() we must ensure perf_sched_cb_{inc,dec}() are no longer called from NMI context; this means that pmu::{start,stop}() can no longer use them. Prepare for this by reworking the whole large PEBS setup code. The current code relied on the cpuc->pebs_enabled state, however since that reflects the current active state as per pmu::{start,stop}() we can no longer rely on this. Introduce two counters: cpuc->n_pebs and cpuc->n_large_pebs which count the total number of PEBS events and the number of PEBS events that have FREERUNNING set, resp.. With this we can tell if the current setup requires a single record interrupt threshold or can use a larger buffer. This also improves the code in that it re-enables the large threshold once the PEBS event that required single record gets removed. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-27perf/x86/intel: Fix MSR_LAST_BRANCH_FROM_x bug when no TSXDavid Carrillo-Cisneros1-0/+2
Intel's SDM states that bits 61:62 in MSR_LAST_BRANCH_FROM_x are the TSX flags for formats with LBR_TSX flags (i.e. LBR_FORMAT_EIP_EFLAGS2). However, when the CPU has TSX support deactivated, bits 61:62 actually behave as follows: - For wrmsr(), bits 61:62 are considered part of the sign extension. - When capturing branches, the LBR hw will always clear bits 61:62. regardless of the sign extension. Therefore, if: 1) LBR has TSX format. 2) CPU has no TSX support enabled. ... then any value passed to wrmsr() must be sign extended to 63 bits and any value from rdmsr() must be converted to have a sign extension of 61 bits, ignoring the values at TSX flags. This bug was masked by the work-around to the Intel's CPU bug: BJ94. "LBR May Contain Incorrect Information When Using FREEZE_LBRS_ON_PMI" in Document Number: 324643-037US. The aforementioned work-around uses hw flags to filter out all kernel branches, limiting LBR callstack to user level execution only. Since user addresses are not sign extended, they do not trigger the wrmsr() bug in MSR_LAST_BRANCH_FROM_x when saved/restored at context switch. To verify the hw bug: $ perf record -b -e cycles sleep 1 $ rdmsr -p 0 0x680 0x1fffffffb0b9b0cc $ wrmsr -p 0 0x680 0x1fffffffb0b9b0cc write(): Input/output error The quirk for LBR_FROM_ MSRs is required before calls to wrmsrl() and after rdmsrl(). This patch introduces it for wrmsrl()'s done for testing LBR support. Future patch in series adds the quirk for context switch, that would be required if LBR callstack is to be enabled for ring 0. Signed-off-by: David Carrillo-Cisneros <davidcc@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Stephane Eranian <eranian@google.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1466533874-52003-3-git-send-email-davidcc@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-03perf/x86: Support sysfs files depending on SMT statusAndi Kleen1-0/+10
Add a way to show different sysfs events attributes depending on HyperThreading is on or off. This is difficult to determine early at boot, so we just do it dynamically when the sysfs attribute is read. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: acme@kernel.org Cc: jolsa@kernel.org Link: http://lkml.kernel.org/r/1463703002-19686-3-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-05perf/x86/intel/pt: Bypass PT vs. LBR exclusivity if the core supports itAlexander Shishkin1-0/+1
Not all cores prevent using Intel PT and LBRs simultaneously, although most of them still do as of today. This patch adds an opt-in flag for such cores to disable mutual exclusivity between PT and LBR; also flip it on for Goldmont. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: vince@deater.net Link: http://lkml.kernel.org/r/1461857746-31346-4-git-send-email-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-23perf/x86/intel: Add LBR filter support for Silvermont and Airmont CPUsKan Liang1-0/+2
LBR filtering is also supported on the Silvermont and Airmont microarchitectures. The layout of MSR_LBR_SELECT is the same as Nehalem. Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1460706825-46163-1-git-send-email-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-23perf/x86/intel: Add Goldmont CPU supportKan Liang1-0/+2
Add perf core PMU support for Intel Goldmont CPU cores: - The init code is based on Silvermont. - There is a new cache event list, based on the Silvermont cache event list. - Goldmont has 32 LBR entries. It also uses new LBRv6 format, which report the cycle information using upper 16-bit of the LBR_TO. - It's recommended to use CPU_CLK_UNHALTED.CORE_P + NPEBS for precise cycles. For details, please refer to the latest SDM058: http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-vol-3b-part-2-manual.pdf Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1460706167-45320-1-git-send-email-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-03Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds1-3/+3
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Misc kernel side fixes: - fix event leak - fix AMD PMU driver bug - fix core event handling bug - fix build bug on certain randconfigs Plus misc tooling fixes" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/amd/ibs: Fix pmu::stop() nesting perf/core: Don't leak event in the syscall error path perf/core: Fix time tracking bug with multiplexing perf jit: genelf makes assumptions about endian perf hists: Fix determination of a callchain node's childlessness perf tools: Add missing initialization of perf_sample.cpumode in synthesized samples perf tools: Fix build break on powerpc perf/x86: Move events_sysfs_show() outside CPU_SUP_INTEL perf bench: Fix detached tarball building due to missing 'perf bench memcpy' headers perf tests: Fix tarpkg build test error output redirection
2016-03-29perf/x86/amd: Cleanup Fam10h NB event constraintsPeter Zijlstra1-0/+5
Avoid allocating the AMD NB event constraints data structure when not needed. This gets rid of x86_max_cores usage and avoids allocating this on AMD Core Perfctr supporting hardware (which has separate MSRs for NB events). Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: aherrmann@suse.com Cc: Rui Huang <ray.huang@amd.com> Cc: Borislav Petkov <bp@alien8.de> Cc: jencce.kernel@gmail.com Link: http://lkml.kernel.org/r/20160320124629.GY6375@twins.programming.kicks-ass.net Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-03-25perf/x86: Move events_sysfs_show() outside CPU_SUP_INTELHuang Rui1-3/+3
randconfig builds can sometimes disable CONFIG_CPU_SUP_INTEL while enabling the AMD power reporting PMU driver, resulting in this build failure: arch/x86/kernel/cpu/perf_event.h:663:31: error: 'events_sysfs_show' undeclared here (not in a function) To fix it, move events_sysfs_show() outside of #ifdef CONFIG_CPU_SUP_INTEL. Reported-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: build test robot <lkp@intel.com> Signed-off-by: Huang Rui <ray.huang@amd.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bp@suse.de> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sherry Hurwitz <sherry.hurwitz@amd.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: acme@kernel.org Cc: kbuild-all@01.org Cc: linux-next@vger.kernel.org Cc: spg_linux_kernel@amd.com Link: http://lkml.kernel.org/r/1458875905-4278-1-git-send-email-ray.huang@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-17Merge branch 'x86/cleanups' into x86/urgentIngo Molnar1-1/+1
Pull in some merge window leftovers. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-08perf/x86/intel: Fix PEBS data source interpretation on Nehalem/WestmereAndi Kleen1-0/+2
Jiri reported some time ago that some entries in the PEBS data source table in perf do not agree with the SDM. We investigated and the bits changed for Sandy Bridge, but the SDM was not updated. perf already implements the bits correctly for Sandy Bridge and later. This patch patches it up for Nehalem and Westmere. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <stable@vger.kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: jolsa@kernel.org Link: http://lkml.kernel.org/r/1456871124-15985-1-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-08perf/x86/pebs: Add proper PEBS constraints for BroadwellStephane Eranian1-0/+2
This patch adds a Broadwell specific PEBS event constraint table. Broadwell has a fix for the HT corruption bug erratum HSD29 on Haswell. Therefore, there is no need to mark events 0xd0, 0xd1, 0xd2, 0xd3 has requiring the exclusive mode across both sibling HT threads. This holds true for regular counting and sampling (see core.c) and PEBS (ds.c) which we fix in this patch. In doing so, we relax evnt scheduling for these events, they can now be programmed on any 4 counters without impacting what is measured on the sibling thread. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: acme@redhat.com Cc: adrian.hunter@intel.com Cc: jolsa@redhat.com Cc: kan.liang@intel.com Cc: namhyung@kernel.org Link: http://lkml.kernel.org/r/1457034642-21837-4-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-08perf/x86/intel: Use PAGE_SIZE for PEBS buffer size on Core2Jiri Olsa1-0/+1
Using PAGE_SIZE buffers makes the WRMSR to PERF_GLOBAL_CTRL in intel_pmu_enable_all() mysteriously hang on Core2. As a workaround, we don't do this. The hard lockup is easily triggered by running 'perf test attr' repeatedly. Most of the time it gets stuck on sample session with small periods. # perf test attr -vv 14: struct perf_event_attr setup : --- start --- ... 'PERF_TEST_ATTR=/tmp/tmpuEKz3B /usr/bin/perf record -o /tmp/tmpuEKz3B/perf.data -c 123 kill >/dev/null 2>&1' ret 1 Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Andi Kleen <ak@linux.intel.com> Cc: <stable@vger.kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20160301190352.GA8355@krava.redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-17perf/x86: Move perf_event.h to its new homeBorislav Petkov1-0/+955
Now that all functionality has been moved to arch/x86/events/, move the perf_event.h header and adjust include paths. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1455098123-11740-18-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>