summaryrefslogtreecommitdiffstats
path: root/arch/powerpc
AgeCommit message (Collapse)AuthorFilesLines
2014-04-01Merge tag 'pm+acpi-3.15-rc1' of ↵Linus Torvalds1-2/+1
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI and power management updates from Rafael Wysocki: "The majority of this material spent some time in linux-next, some of it even several weeks. There are a few relatively fresh commits in it, but they are mostly fixes and simple cleanups. ACPI took the lead this time, both in terms of the number of commits and the number of modified lines of code, cpufreq follows and there are a few changes in the PM core and in cpuidle too. A new feature that already got some LWN.net's attention is the device PM QoS extension allowing latency tolerance requirements to be propagated from leaf devices to their ancestors with hardware interfaces for specifying latency tolerance. That should help systems with hardware-driven power management to avoid going too far with it in cases when there are latency tolerance constraints. There also are some significant changes in the ACPI core related to the way in which hotplug notifications are handled. They affect PCI hotplug (ACPIPHP) and the ACPI dock station code too. The bottom line is that all those notification now go through the root notify handler and are propagated to the interested subsystems by means of callbacks instead of having to install a notify handler for each device object that we can potentially get hotplug notifications for. In addition to that ACPICA will now advertise "Windows 2013" compatibility for _OSI, because some systems out there don't work correctly if that is not done (some of them don't even boot). On the system suspend side of things, all of the device suspend and resume callbacks, except for ->prepare() and ->complete(), are now going to be executed asynchronously as that turns out to speed up system suspend and resume on some platforms quite significantly and we have a few more optimizations in that area. Apart from that, there are some new device IDs and fixes and cleanups all over. In particular, the system suspend and resume handling by cpufreq should be improved and the cpuidle menu governor should be a bit more robust now. Specifics: - Device PM QoS support for latency tolerance constraints on systems with hardware interfaces allowing such constraints to be specified. That is necessary to prevent hardware-driven power management from becoming overly aggressive on some systems and to prevent power management features leading to excessive latencies from being used in some cases. - Consolidation of the handling of ACPI hotplug notifications for device objects. This causes all device hotplug notifications to go through the root notify handler (that was executed for all of them anyway before) that propagates them to individual subsystems, if necessary, by executing callbacks provided by those subsystems (those callbacks are associated with struct acpi_device objects during device enumeration). As a result, the code in question becomes both smaller in size and more straightforward and all of those changes should not affect users. - ACPICA update, including fixes related to the handling of _PRT in cases when it is broken and the addition of "Windows 2013" to the list of supported "features" for _OSI (which is necessary to support systems that work incorrectly or don't even boot without it). Changes from Bob Moore and Lv Zheng. - Consolidation of ACPI _OST handling from Jiang Liu. - ACPI battery and AC fixes allowing unusual system configurations to be handled by that code from Alexander Mezin. - New device IDs for the ACPI LPSS driver from Chiau Ee Chew. - ACPI fan and thermal optimizations related to system suspend and resume from Aaron Lu. - Cleanups related to ACPI video from Jean Delvare. - Assorted ACPI fixes and cleanups from Al Stone, Hanjun Guo, Lan Tianyu, Paul Bolle, Tomasz Nowicki. - Intel RAPL (Running Average Power Limits) driver cleanups from Jacob Pan. - intel_pstate fixes and cleanups from Dirk Brandewie. - cpufreq fixes related to system suspend/resume handling from Viresh Kumar. - cpufreq core fixes and cleanups from Viresh Kumar, Stratos Karafotis, Saravana Kannan, Rashika Kheria, Joe Perches. - cpufreq drivers updates from Viresh Kumar, Zhuoyu Zhang, Rob Herring. - cpuidle fixes related to the menu governor from Tuukka Tikkanen. - cpuidle fix related to coupled CPUs handling from Paul Burton. - Asynchronous execution of all device suspend and resume callbacks, except for ->prepare and ->complete, during system suspend and resume from Chuansheng Liu. - Delayed resuming of runtime-suspended devices during system suspend for the PCI bus type and ACPI PM domain. - New set of PM helper routines to allow device runtime PM callbacks to be used during system suspend and resume more easily from Ulf Hansson. - Assorted fixes and cleanups in the PM core from Geert Uytterhoeven, Prabhakar Lad, Philipp Zabel, Rashika Kheria, Sebastian Capella. - devfreq fix from Saravana Kannan" * tag 'pm+acpi-3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (162 commits) PM / devfreq: Rewrite devfreq_update_status() to fix multiple bugs PM / sleep: Correct whitespace errors in <linux/pm.h> intel_pstate: Set core to min P state during core offline cpufreq: Add stop CPU callback to cpufreq_driver interface cpufreq: Remove unnecessary braces cpufreq: Fix checkpatch errors and warnings cpufreq: powerpc: add cpufreq transition latency for FSL e500mc SoCs MAINTAINERS: Reorder maintainer addresses for PM and ACPI PM / Runtime: Update runtime_idle() documentation for return value meaning video / output: Drop display output class support fujitsu-laptop: Drop unneeded include acer-wmi: Stop selecting VIDEO_OUTPUT_CONTROL ACPI / gpu / drm: Stop selecting VIDEO_OUTPUT_CONTROL ACPI / video: fix ACPI_VIDEO dependencies cpufreq: remove unused notifier: CPUFREQ_{SUSPENDCHANGE|RESUMECHANGE} cpufreq: Do not allow ->setpolicy drivers to provide ->target cpufreq: arm_big_little: set 'physical_cluster' for each CPU cpufreq: arm_big_little: make vexpress driver depend on bL core driver ACPI / button: Add ACPI Button event via netlink routine ACPI: Remove duplicate definitions of PREFIX ...
2014-04-01Merge branch 'irq-core-for-linus' of ↵Linus Torvalds3-18/+25
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq code updates from Thomas Gleixner: "The irq department proudly presents: - Another tree wide sweep of irq infrastructure abuse. Clear winner of the trainwreck engineering contest was: #include "../../../kernel/irq/settings.h" - Tree wide update of irq_set_affinity() callbacks which miss a cpu online check when picking a single cpu out of the affinity mask. - Tree wide consolidation of interrupt statistics. - Updates to the threaded interrupt infrastructure to allow explicit wakeup of the interrupt thread and a variant of synchronize_irq() which synchronizes only the hard interrupt handler. Both are needed to replace the homebrewn thread handling in the mmc/sdhci code. - New irq chip callbacks to allow proper support for GPIO based irqs. The GPIO based interrupts need to request/release GPIO resources from request/free_irq. - A few new ARM interrupt chips. No revolutionary new hardware, just differently wreckaged variations of the scheme. - Small improvments, cleanups and updates all over the place" I was hoping that that trainwreck engineering contest was a April Fools' joke. But no. * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (68 commits) irqchip: sun7i/sun6i: Disable NMI before registering the handler ARM: sun7i/sun6i: dts: Fix IRQ number for sun6i NMI controller ARM: sun7i/sun6i: irqchip: Update the documentation ARM: sun7i/sun6i: dts: Add NMI irqchip support ARM: sun7i/sun6i: irqchip: Add irqchip driver for NMI controller genirq: Export symbol no_action() arm: omap: Fix typo in ams-delta-fiq.c m68k: atari: Fix the last kernel_stat.h fallout irqchip: sun4i: Simplify sun4i_irq_ack irqchip: sun4i: Use handle_fasteoi_irq for all interrupts genirq: procfs: Make smp_affinity values go+r softirq: Add linux/irq.h to make it compile again m68k: amiga: Add linux/irq.h to make it compile again irqchip: sun4i: Don't ack IRQs > 0, fix acking of IRQ 0 irqchip: sun4i: Fix a comment about mask register initialization irqchip: sun4i: Fix irq 0 not working genirq: Add a new IRQCHIP_EOI_THREADED flag genirq: Document IRQCHIP_ONESHOT_SAFE flag ARM: sunxi: dt: Convert to the new irq controller compatibles irqchip: sunxi: Change compatibles ...
2014-03-31Merge branch 'sched-core-for-linus' of ↵Linus Torvalds4-34/+15
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler changes from Ingo Molnar: "Bigger changes: - sched/idle restructuring: they are WIP preparation for deeper integration between the scheduler and idle state selection, by Nicolas Pitre. - add NUMA scheduling pseudo-interleaving, by Rik van Riel. - optimize cgroup context switches, by Peter Zijlstra. - RT scheduling enhancements, by Thomas Gleixner. The rest is smaller changes, non-urgnt fixes and cleanups" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (68 commits) sched: Clean up the task_hot() function sched: Remove double calculation in fix_small_imbalance() sched: Fix broken setscheduler() sparc64, sched: Remove unused sparc64_multi_core sched: Remove unused mc_capable() and smt_capable() sched/numa: Move task_numa_free() to __put_task_struct() sched/fair: Fix endless loop in idle_balance() sched/core: Fix endless loop in pick_next_task() sched/fair: Push down check for high priority class task into idle_balance() sched/rt: Fix picking RT and DL tasks from empty queue trace: Replace hardcoding of 19 with MAX_NICE sched: Guarantee task priority in pick_next_task() sched/idle: Remove stale old file sched: Put rq's sched_avg under CONFIG_FAIR_GROUP_SCHED cpuidle/arm64: Remove redundant cpuidle_idle_call() cpuidle/powernv: Remove redundant cpuidle_idle_call() sched, nohz: Exclude isolated cores from load balancing sched: Fix select_task_rq_fair() description comments workqueue: Replace hardcoding of -20 and 19 with MIN_NICE and MAX_NICE sys: Replace hardcoding of -20 and 19 with MIN_NICE and MAX_NICE ...
2014-03-31Merge branch 'core-locking-for-linus' of ↵Linus Torvalds1-2/+3
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core locking updates from Ingo Molnar: "The biggest change is the MCS spinlock generalization changes from Tim Chen, Peter Zijlstra, Jason Low et al. There's also lockdep fixes/enhancements from Oleg Nesterov, in particular a false negative fix related to lockdep_set_novalidate_class() usage" * 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (22 commits) locking/mutex: Fix debug checks locking/mutexes: Add extra reschedule point locking/mutexes: Introduce cancelable MCS lock for adaptive spinning locking/mutexes: Unlock the mutex without the wait_lock locking/mutexes: Modify the way optimistic spinners are queued locking/mutexes: Return false if task need_resched() in mutex_can_spin_on_owner() locking: Move mcs_spinlock.h into kernel/locking/ m68k: Skip futex_atomic_cmpxchg_inatomic() test futex: Allow architectures to skip futex_atomic_cmpxchg_inatomic() test Revert "sched/wait: Suppress Sparse 'variable shadowing' warning" lockdep: Change lockdep_set_novalidate_class() to use _and_name lockdep: Change mark_held_locks() to check hlock->check instead of lockdep_no_validate lockdep: Don't create the wrong dependency on hlock->check == 0 lockdep: Make held_lock->check and "int check" argument bool locking/mcs: Allow architecture specific asm files to be used for contended case locking/mcs: Order the header files in Kbuild of each architecture in alphabetical order sched/wait: Suppress Sparse 'variable shadowing' warning hung_task/Documentation: Fix hung_task_warnings description locking/mcs: Allow architectures to hook in to contended paths locking/mcs: Micro-optimize the MCS code, add extra comments ...
2014-03-31Merge remote-tracking branch 'robh/for-next' into devicetree/nextGrant Likely2-0/+10
2014-03-31net: filter: add jited flag to indicate jit compiled filtersDaniel Borkmann1-1/+2
This patch adds a jited flag into sk_filter struct in order to indicate whether a filter is currently jited or not. The size of sk_filter is not being expanded as the 32 bit 'len' member allows upper bits to be reused since a filter can currently only grow as large as BPF_MAXINSNS. Therefore, there's enough room also for other in future needed flags to reuse 'len' field if necessary. The jited flag also allows for having alternative interpreter functions running as currently, we can only detect jit compiled filters by testing fp->bpf_func to not equal the address of sk_run_filter(). Joint work with Alexei Starovoitov. Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-29Merge branch 'kvm-ppchv-next' of ↵Paolo Bonzini13-67/+377
git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into kvm-next
2014-03-29KVM: PPC: Book3S HV: Save/restore host PMU registers that are new in POWER8Paul Mackerras3-1/+33
Currently we save the host PMU configuration, counter values, etc., when entering a guest, and restore it on return from the guest. (We have to do this because the guest has control of the PMU while it is executing.) However, we missed saving/restoring the SIAR and SDAR registers, as well as the registers which are new on POWER8, namely SIER and MMCR2. This adds code to save the values of these registers when entering the guest and restore them on exit. This also works around the bug in POWER8 where setting PMAE with a counter already negative doesn't generate an interrupt. Signed-off-by: Paul Mackerras <paulus@samba.org> Acked-by: Scott Wood <scottwood@freescale.com>
2014-03-29KVM: PPC: Book3S HV: Fix decrementer timeouts with non-zero TB offsetPaul Mackerras1-1/+9
Commit c7699822bc21 ("KVM: PPC: Book3S HV: Make physical thread 0 do the MMU switching") reordered the guest entry/exit code so that most of the guest register save/restore code happened in guest MMU context. A side effect of that is that the timebase still contains the guest timebase value at the point where we compute and use vcpu->arch.dec_expires, and therefore that is now a guest timebase value rather than a host timebase value. That in turn means that the timeouts computed in kvmppc_set_timer() are wrong if the timebase offset for the guest is non-zero. The consequence of that is things such as "sleep 1" in a guest after migration may sleep for much longer than they should. This fixes the problem by converting between guest and host timebase values as necessary, by adding or subtracting the timebase offset. This also fixes an incorrect comment. Signed-off-by: Paul Mackerras <paulus@samba.org> Acked-by: Scott Wood <scottwood@freescale.com>
2014-03-29KVM: PPC: Book3S HV: Don't use kvm_memslots() in real modePaul Mackerras2-3/+15
With HV KVM, some high-frequency hypercalls such as H_ENTER are handled in real mode, and need to access the memslots array for the guest. Accessing the memslots array is safe, because we hold the SRCU read lock for the whole time that a guest vcpu is running. However, the checks that kvm_memslots() does when lockdep is enabled are potentially unsafe in real mode, when only the linear mapping is available. Furthermore, kvm_memslots() can be called from a secondary CPU thread, which is an offline CPU from the point of view of the host kernel, and is not running the task which holds the SRCU read lock. To avoid false positives in the checks in kvm_memslots(), and to avoid possible side effects from doing the checks in real mode, this replaces kvm_memslots() with kvm_memslots_raw() in all the places that execute in real mode. kvm_memslots_raw() is a new function that is like kvm_memslots() but uses rcu_dereference_raw_notrace() instead of kvm_dereference_check(). Signed-off-by: Paul Mackerras <paulus@samba.org> Acked-by: Scott Wood <scottwood@freescale.com>
2014-03-29KVM: PPC: Book3S HV: Return ENODEV error rather than EIOPaul Mackerras1-1/+1
If an attempt is made to load the kvm-hv module on a machine which doesn't have hypervisor mode available, return an ENODEV error, which is the conventional thing to return to indicate that this module is not applicable to the hardware of the current machine, rather than EIO, which causes a warning to be printed. Signed-off-by: Paul Mackerras <paulus@samba.org> Acked-by: Scott Wood <scottwood@freescale.com>
2014-03-29KVM: PPC: Book3S: Trim top 4 bits of physical address in RTAS codePaul Mackerras1-2/+5
The in-kernel emulation of RTAS functions needs to read the argument buffer from guest memory in order to find out what function is being requested. The guest supplies the guest physical address of the buffer, and on a real system the code that reads that buffer would run in guest real mode. In guest real mode, the processor ignores the top 4 bits of the address specified in load and store instructions. In order to emulate that behaviour correctly, we need to mask off those bits before calling kvm_read_guest() or kvm_write_guest(). This adds that masking. Signed-off-by: Paul Mackerras <paulus@samba.org> Acked-by: Scott Wood <scottwood@freescale.com>
2014-03-29KVM: PPC: Book3S HV: Add get/set_one_reg for new TM stateMichael Neuling1-22/+125
This adds code to get/set_one_reg to read and write the new transactional memory (TM) state. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Paul Mackerras <paulus@samba.org> Acked-by: Scott Wood <scottwood@freescale.com>
2014-03-29KVM: PPC: Book3S HV: Add transactional memory supportMichael Neuling4-30/+149
This adds saving of the transactional memory (TM) checkpointed state on guest entry and exit. We only do this if we see that the guest has an active transaction. It also adds emulation of the TM state changes when delivering IRQs into the guest. According to the architecture, if we are transactional when an IRQ occurs, the TM state is changed to suspended, otherwise it's left unchanged. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Paul Mackerras <paulus@samba.org> Acked-by: Scott Wood <scottwood@freescale.com>
2014-03-26net: Rename skb->rxhash to skb->hashTom Herbert1-2/+2
The packet hash can be considered a property of the packet, not just on RX path. This patch changes name of rxhash and l4_rxhash skbuff fields to be hash and l4_hash respectively. This includes changing uses of the field in the code which don't call the access functions. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-26KVM: PPC: Book3S HV: Fix KVM hang with CONFIG_KVM_XICS=nAnton Blanchard1-3/+1
I noticed KVM is broken when KVM in-kernel XICS emulation (CONFIG_KVM_XICS) is disabled. The problem was introduced in 48eaef05 (KVM: PPC: Book3S HV: use xics_wake_cpu only when defined). It used CONFIG_KVM_XICS to wrap xics_wake_cpu, where CONFIG_PPC_ICP_NATIVE should have been used. Signed-off-by: Anton Blanchard <anton@samba.org> Cc: stable@vger.kernel.org Signed-off-by: Paul Mackerras <paulus@samba.org> Acked-by: Scott Wood <scottwood@freescale.com>
2014-03-26KVM: PPC: Book3S: Introduce hypervisor call H_GET_TCELaurent Dufour3-1/+31
This introduces the H_GET_TCE hypervisor call, which is basically the reverse of H_PUT_TCE, as defined in the Power Architecture Platform Requirements (PAPR). The hcall H_GET_TCE is required by the kdump kernel, which uses it to retrieve TCEs set up by the previous (panicked) kernel. Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Paul Mackerras <paulus@samba.org>
2014-03-26KVM: PPC: Book3S HV: Fix incorrect userspace exit on ioeventfd writeGreg Kurz2-3/+8
When the guest does an MMIO write which is handled successfully by an ioeventfd, ioeventfd_write() returns 0 (success) and kvmppc_handle_store() returns EMULATE_DONE. Then kvmppc_emulate_mmio() converts EMULATE_DONE to RESUME_GUEST_NV and this causes an exit from the loop in kvmppc_vcpu_run_hv(), causing an exit back to userspace with a bogus exit reason code, typically causing userspace (e.g. qemu) to crash with a message about an unknown exit code. This adds handling of RESUME_GUEST_NV in kvmppc_vcpu_run_hv() in order to fix that. For generality, we define a helper to check for either of the return-to-guest codes we use, RESUME_GUEST and RESUME_GUEST_NV, to make it easy to check for either and provide one place to update if any other return-to-guest code gets defined in future. Since it only affects Book3S HV for now, the helper is added to the kvm_book3s.h header file. We use the helper in two places in kvmppc_run_core() as well for future-proofing, though we don't see RESUME_GUEST_NV in either place at present. [paulus@samba.org - combined 4 patches into one, rewrote description] Suggested-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2014-03-24Merge remote-tracking branch 'scott/next' into nextBenjamin Herrenschmidt57-210/+1259
Freescale updates from Scott. Mostly support for critical and machine check exceptions on 64-bit BookE, some new PCI suspend/resume work and misc bits.
2014-03-24powerpc/book3s: Fix CFAR clobbering issue in machine check handler.Mahesh Salgaonkar2-0/+13
While checking powersaving mode in machine check handler at 0x200, we clobber CFAR register. Fix it by saving and restoring it during beq/bgt. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-03-24powerpc/compat: 32-bit little endian machine name is ppcle, not ppcAnton Blanchard1-0/+4
I noticed this when testing setarch. No, we don't magically support a big endian userspace on a little endian kernel. Signed-off-by: Anton Blanchard <anton@samba.org> Cc: stable@vger.kernel.org # v3.10+ Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-03-24powerpc/le: Big endian arguments for ppc_rtas()Greg Kurz1-9/+13
The ppc_rtas() syscall allows userspace to interact directly with RTAS. For the moment, it assumes every thing is big endian and returns either EINVAL or EFAULT when called in a little endian environment. As suggested by Benjamin, to avoid bugs when userspace wants to pass a non 32 bit value to RTAS, it is far better to stick with a simple rationale: ppc_rtas() should be called with a big endian rtas_args structure. With this patch, it is now up to userspace to forge big endian arguments, as expected by RTAS. Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-03-24powerpc: Use default set of netfilter modules (CONFIG_NETFILTER_ADVANCED=n)Anton Blanchard4-238/+8
Our netfilter options are stale and important things like masquerading are no longer enabled. Instead of trying to keep up with any updates, set CONFIG_NETFILTER_ADVANCED=n on ppc64* and pseries* defconfigs. This enables the most common netfilter modules for us. While here, enable the network bridge module which is heavily used in KVM setups. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-03-24powerpc/defconfigs: Enable THP in pseries defconfigAneesh Kumar K.V1-0/+2
We also set it to be enabled always. This helps in wider testing Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-03-24powerpc/mm: Make sure a local_irq_disable prevent a parallel THP splitAneesh Kumar K.V1-0/+5
We have generic code like the one in get_futex_key that assume that a local_irq_disable prevents a parallel THP split. Support that by adding a dummy smp call function after setting _PAGE_SPLITTING. Code paths like get_user_pages_fast still need to check for _PAGE_SPLITTING after disabling IRQ which indicate that a parallel THP splitting is ongoing. Now if they don't find _PAGE_SPLITTING set, then we can be sure that parallel split will now block in pmdp_splitting flush until we enables IRQ Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-03-24powerpc: Rate-limit users spamming kernel log bufferMichael Neuling1-2/+3
The facility unavailable exception can be triggered from userspace by accessing PMU registers when EBB is not enabled. This causes the included pr_err() to run, hence spamming the kernel log buffer. This avoids this by rate limiting these messages. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-03-24powerpc/perf: Fix handling of L3 events with bank == 1Michael Ellerman1-2/+3
Currently we reject events which have the L3 bank == 1, such as 0x000084918F, because the cache field is non-zero. However that is incorrect, because although the bank is non-zero, the value we would write into MMCRC is zero, and so we can count the event. So fix the check to ignore the bank selector when checking whether the cache selector is non-zero. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-03-24powerpc/perf: Add kconfig option for hypervisor provided countersCody P Schafer2-0/+14
The commit adds a Kconfig option which allows the hv_gpci and hv_24x7 PMUs, added in the preceeding commits, to be built. Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-03-24powerpc/perf: Add support for the hv 24x7 interfaceCody P Schafer1-0/+510
This provides a basic interface between hv_24x7 and perf. Similar to the one provided for gpci, it lacks transaction support and does not list any events. Example usage via perf tool: perf stat -e 'hv_24x7/domain=2,offset=8,starting_index=0,lpar=0xffffffff/' -r 0 -C 0 -x ' ' sleep 0.1 Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-03-24powerpc/perf: Add support for the hv gpci (get performance counter info) ↵Cody P Schafer1-0/+294
interface This provides a basic link between perf and hv_gpci. Notably, it does not yet support transactions and does not list any events (they can still be manually composed). Example usage via perf tool: perf stat -e 'hv_gpci/counter_info_version=3,offset=0,length=8,secondary_index=0,starting_index=0xffffffff,request=0x10/' -r 0 -C 0 -x ' ' sleep 0.1 Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-03-24powerpc/perf: Add macros for defining event fields & formatsCody P Schafer1-0/+19
Add two macros which generate functions to extract the relevent bits from event->attr.config{,1,2}. EVENT_DEFINE_RANGE() defines an accessor for a range of bits in the event, as well as a "max" function that gives the maximum value of the field based on the bit width. EVENT_DEFINE_RANGE_FORMAT() defines the accessor & max routine and also a format attribute for use in the PMU's attr_groups. Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com> [mpe: move to powerpc, ugly but descriptive macro names] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-03-24powerpc/perf: Add a shared interface to get gpci version and capabilitiesCody P Schafer2-0/+56
This exposes a simple way to grab the firmware provided collect_priveliged, ga, expanded, and lab capability bits. All of these bits come in from the same gpci request, so we've exposed all of them. Only the collect_priveliged bit is really used by the hv-gpci/hv-24x7 code, the other bits are simply exposed in sysfs to inform the user. Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-03-24powerpc/perf: Add 24x7 interface headersCody P Schafer2-0/+142
24x7 (also called hv_24x7 or H_24X7) is an interface to obtain performance counters from the hypervisor. These counters do not have a fixed format/possition and are instead documented in a "24x7 Catalog", which is provided by the hypervisor (that interface is also documented paritialy in the included hv-24x7-catalog.h and fully in at https://raw.githubusercontent.com/jmesmon/catalog-24x7/master/hv-24x7-catalog.h ). The 24x7 data access is simply a copy operation into a 4 dimentional array of 64bit counters (from hypervisor to kernel memory). There is no interupt triggered on overflow, these are completely disjoint from the typical power pmu. This method of obtaining performance counters from the hypervisor is intended to paritialy replace the gpci interface. Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-03-24powerpc/perf: Add hv_gpci interface headerCody P Schafer1-0/+73
"H_GetPerformanceCounterInfo" (refered to as hv_gpci or just gpci from here on) is an interface to retrieve specific performance counters and other data from the hypervisor. All outputs have a fixed format. This header only describes the portions of the interface that we plan on using in linux at this time. Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-03-24powerpc: Add hvcalls for 24x7 and gpci (Get Performance Counter Info)Cody P Schafer1-0/+5
Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-03-24powerpc/perf: Enable BHRB access for EBB eventsMichael Ellerman2-4/+7
The previous commit added constraint and register handling to allow processes using EBB (Event Based Branches) to request access to the BHRB (Branch History Rolling Buffer). With that in place we can allow processes using EBB to access the BHRB. This is achieved by setting BHRBA in MMCR0 when we enable EBB access. We must also clear BHRBA when we are disabling. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-03-24powerpc/perf: Add BHRB constraint and IFM MMCRA handling for EBBMichael Ellerman1-9/+44
We want a way for users of EBB (Event Based Branches) to also access the BHRB (Branch History Rolling Buffer). EBB does not interoperate with our existing BHRB support, which is wired into the generic Linux branch stack sampling support. To support EBB & BHRB we add three new bits to the event code. The first bit indicates that the event wants access to the BHRB, and the other two bits indicate the desired IFM (Instruction Filtering Mode). We allow multiple events to request access to the BHRB, but they must agree on the IFM value. Events which are not interested in the BHRB can also interoperate with events which do. Finally we program the desired IFM value into MMCRA. Although we do this for every event, we know that the value will be identical for all events that request BHRB access. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-03-24powerpc/perf: Avoid mutating event in power8_get_constraint()Michael Ellerman1-6/+8
We only need to mask the EBB bit out of the event for the check of the special PMC 5 & 6 events. So use a local to do it just for that code, rather than changing the event value for the life of the function. While we're there move the set of mask and value after all the checks. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-03-24powerpc/perf: Clean up the EBB hash defines a littleMichael Ellerman1-3/+4
Rather than using PERF_EVENT_CONFIG_EBB_SHIFT everywhere, add an EVENT_EBB_SHIFT like every other event and use that. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-03-24powerpc/perf: Reject EBB events which specify a sample_typeMichael Ellerman1-2/+5
Although we already block EBB events which request sampling using sample_period, technically it's possible for an event to set sample_type but not sample_period. Nothing terrible will happen if an EBB event does specify sample_type, but it signals a major confusion on the part of userspace, and so we do them the favor of rejecting it. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-03-24powerpc/perf: Add lost exception workaroundMichael Ellerman3-2/+105
Some power8 revisions have a hardware bug where we can lose a PMU exception, this commit adds a workaround to detect the bad condition and rectify the situation. See the comment in the commit for a full description. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-03-24powerpc: Add a cpu feature CPU_FTR_PMAO_BUGMichael Ellerman2-3/+5
Some power8 revisions have a hardware bug where we can lose a Performance Monitor (PMU) exception under certain circumstances. We will be adding a workaround for this case, see the next commit for details. The observed behaviour is that writing PMAO doesn't cause an exception as we would expect, hence the name of the feature. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-03-24powerpc/perf: Define perf_event_print_debug() to print PMU register valuesAnshuman Khandual2-4/+52
Currently the sysrq ShowRegs command does not print any PMU registers as we have an empty definition for perf_event_print_debug(). This patch defines perf_event_print_debug() to print various PMU registers. Example output: CPU: 0 PMU registers, ppmu = POWER7 n_counters = 6 PMC1: 00000000 PMC2: 00000000 PMC3: 00000000 PMC4: 00000000 PMC5: 00000000 PMC6: 00000000 PMC7: deadbeef PMC8: deadbeef MMCR0: 0000000080000000 MMCR1: 0000000000000000 MMCRA: 0f00000001000000 SIAR: 0000000000000000 SDAR: 0000000000000000 SIER: 0000000000000000 Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com> [mpe: Fix 32 bit build and rework formatting for compactness] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-03-24powerpc/perf: Make some new raw event codes available in sysfsAnshuman Khandual1-0/+10
This patchset adds some missing event list for POWER7 PMU raw events which are exported through sysfs interface. Also updates the ABI documentation to add all the sysfs exported raw events. Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-03-24powerpc/powernv: Enable fetching of platform sensor dataNeelesh Gupta4-1/+70
This patch enables fetching of various platform sensor data through OPAL and expects a sensor handle from the driver to pass to OPAL. Signed-off-by: Neelesh Gupta <neelegup@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-03-24powerpc/powernv: Enable reading and updating of system parametersNeelesh Gupta5-2/+310
This patch enables reading and updating of system parameters through OPAL call. Signed-off-by: Neelesh Gupta <neelegup@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-03-24powerpc/powernv: Infrastructure to support OPAL async completionNeelesh Gupta3-2/+215
This patch adds support for notifying the clients of their request completion. Clients request for the token before making OPAL call and then wait for the response. This patch uses messaging infrastructure to pull the data to linux by registering itself for the message type OPAL_MSG_ASYNC_COMP. Signed-off-by: Neelesh Gupta <neelegup@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-03-20audit: Add CONFIG_HAVE_ARCH_AUDITSYSCALLAKASHI Takahiro1-0/+1
Currently AUDITSYSCALL has a long list of architecture depencency: depends on AUDIT && (X86 || PARISC || PPC || S390 || IA64 || UML || SPARC64 || SUPERH || (ARM && AEABI && !OABI_COMPAT) || ALPHA) The purpose of this patch is to replace it with HAVE_ARCH_AUDITSYSCALL for simplicity. Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Acked-by: Will Deacon <will.deacon@arm.com> (arm) Acked-by: Richard Guy Briggs <rgb@redhat.com> (audit) Acked-by: Matt Turner <mattst88@gmail.com> (alpha) Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Signed-off-by: Eric Paris <eparis@redhat.com>
2014-03-20powerpc, sysfs: Fix CPU hotplug callback registrationSrivatsa S. Bhat1-1/+7
Subsystems that want to register CPU hotplug callbacks, as well as perform initialization for the CPUs that are already online, often do it as shown below: get_online_cpus(); for_each_online_cpu(cpu) init_cpu(cpu); register_cpu_notifier(&foobar_cpu_notifier); put_online_cpus(); This is wrong, since it is prone to ABBA deadlocks involving the cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently with CPU hotplug operations). Instead, the correct and race-free way of performing the callback registration is: cpu_notifier_register_begin(); for_each_online_cpu(cpu) init_cpu(cpu); /* Note the use of the double underscored version of the API */ __register_cpu_notifier(&foobar_cpu_notifier); cpu_notifier_register_done(); Fix the sysfs code in powerpc by using this latter form of callback registration. Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Olof Johansson <olof@lixom.net> Cc: Wang Dongsheng <dongsheng.wang@freescale.com> Cc: Ingo Molnar <mingo@kernel.org> Acked-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-03-20Merge branch 'pm-cpufreq'Rafael J. Wysocki1-2/+1
* pm-cpufreq: (30 commits) intel_pstate: Set core to min P state during core offline cpufreq: Add stop CPU callback to cpufreq_driver interface cpufreq: Remove unnecessary braces cpufreq: Fix checkpatch errors and warnings cpufreq: powerpc: add cpufreq transition latency for FSL e500mc SoCs cpufreq: remove unused notifier: CPUFREQ_{SUSPENDCHANGE|RESUMECHANGE} cpufreq: Do not allow ->setpolicy drivers to provide ->target cpufreq: arm_big_little: set 'physical_cluster' for each CPU cpufreq: arm_big_little: make vexpress driver depend on bL core driver cpufreq: SPEAr: Instantiate as platform_driver cpufreq: Remove unnecessary variable/parameter 'frozen' cpufreq: Remove cpufreq_generic_exit() cpufreq: add 'freq_table' in struct cpufreq_policy cpufreq: Reformat printk() statements cpufreq: Tegra: Use cpufreq_generic_suspend() cpufreq: s5pv210: Use cpufreq_generic_suspend() cpufreq: exynos: Use cpufreq_generic_suspend() cpufreq: Implement cpufreq_generic_suspend() cpufreq: suspend governors on system suspend/hibernate cpufreq: move call to __find_governor() to cpufreq_init_policy() ...