summaryrefslogtreecommitdiffstats
path: root/kernel
AgeCommit message (Collapse)AuthorFilesLines
2017-04-26srcu: Exact tracking of srcu_data structures containing callbacksPaul E. McKenney1-6/+23
The current Tree SRCU implementation schedules a workqueue for every srcu_data covered by a given leaf srcu_node structure having callbacks, even if only one of those srcu_data structures actually contains callbacks. This is clearly inefficient for workloads that don't feature callbacks everywhere all the time. This commit therefore adds an array of masks that are used by the leaf srcu_node structures to track exactly which srcu_data structures contain callbacks. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Mike Galbraith <efault@gmx.de>
2017-04-25bpf: map_get_next_key to return first key on NULLTeng Qin3-13/+18
When iterating through a map, we need to find a key that does not exist in the map so map_get_next_key will give us the first key of the map. This often requires a lot of guessing in production systems. This patch makes map_get_next_key return the first key when the key pointer in the parameter is NULL. Signed-off-by: Teng Qin <qinteng@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24kallsyms: Use bounded strnchr() when parsing stringNaveen N. Rao1-1/+1
When parsing for the <module:name> format, we use strchr() to look for the separator, when we know that the module name can't be longer than MODULE_NAME_LEN. Enforce the same using strnchr(). Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Jessica Yu <jeyu@redhat.com>
2017-04-24bpf: make bpf_xdp_adjust_head support mandatoryDaniel Borkmann1-3/+0
Now that also the last in-tree user of the xdp_adjust_head bit has been removed, we can remove the flag from struct bpf_prog altogether. This, at the same time, also makes sure that any future driver for XDP comes with bpf_xdp_adjust_head() support right away. A rejection based on this flag would also mean that tail calls couldn't be used with such driver as per c2002f983767 ("bpf: fix checking xdp_adjust_head on tail calls") fix, thus lets not allow for it in the first place. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-23Merge branch 'irq-urgent-for-linus' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fix from Thomas Gleixner: "The (hopefully) final fix for the irq affinity spreading logic" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: genirq/affinity: Fix calculating vectors to assign
2017-04-23Merge branch 'for-mingo' of ↵Ingo Molnar19-708/+1871
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Pull RCU updates from Paul E. McKenney: - Documentation updates. - Miscellaneous fixes. - Parallelize SRCU callback handling (plus overlapping patches). Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-04-21signal: Make kill_proc_info staticEric W. Biederman1-1/+1
There are no users outside of signal.c so make the function static so the compiler and other developers have that information. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2017-04-21Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2-5/+19
Both conflict were simple overlapping changes. In the kaweth case, Eric Dumazet's skb_cow() bug fix overlapped the conversion of the driver in net-next to use in-netdev stats. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-21rlimit: Properly call security_task_setrlimitEric W. Biederman1-2/+1
Modify do_prlimit to call security_task_setrlimit passing the task whose rlimit we are changing not the tsk->group_leader. In general this should not matter as the lsms implementing security_task_setrlimit apparmor and selinux both examine the task->cred to see what should be allowed on the destination task. That task->cred is shared between tasks created with CLONE_THREAD unless thread keyrings are in play, in which case both apparmor and selinux create duplicate security contexts. So the only time when it will matter which thread is passed to security_task_setrlimit is if one of the threads of a process performs an operation that changes only it's credentials. At which point if a thread has done that we don't want to hide that information from the lsms. So fix the call of security_task_setrlimit. With the removal of tsk->group_leader this makes the code slightly faster, more comprehensible and maintainable. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2017-04-21net: Remove NET_CORE_BUDGET_USECS from sysctl binary interface.David S. Miller1-1/+0
We are not supposed to add new entries to this thing any more. Thanks to Eric Dumazet for noticing this. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-21Replace 2 jiffies with sysctl netdev_budget_usecs to enable softirq tuningMatthew Whitehead1-0/+1
Constants used for tuning are generally a bad idea, especially as hardware changes over time. Replace the constant 2 jiffies with sysctl variable netdev_budget_usecs to enable sysadmins to tune the softirq processing. Also document the variable. For example, a very fast machine might tune this to 1000 microseconds, while my regression testing 486DX-25 needs it to be 4000 microseconds on a nearly idle network to prevent time_squeeze from being incremented. Version 2: changed jiffies to microseconds for predictable units. Signed-off-by: Matthew Whitehead <tedheadster@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-21Merge branches 'doc.2017.04.12a', 'fixes.2017.04.19a' and 'srcu.2017.04.21a' ↵Paul E. McKenney19-708/+1871
into HEAD doc.2017.04.12a: Documentation updates fixes.2017.04.19a: Miscellaneous fixes srcu.2017.04.21a: Parallelize SRCU callback handling
2017-04-21rcu: Make non-preemptive schedule be Tasks RCU quiescent statePaul E. McKenney3-2/+23
Currently, a call to schedule() acts as a Tasks RCU quiescent state only if a context switch actually takes place. However, just the call to schedule() guarantees that the calling task has moved off of whatever tracing trampoline that it might have been one previously. This commit therefore plumbs schedule()'s "preempt" parameter into rcu_note_context_switch(), which then records the Tasks RCU quiescent state, but only if this call to schedule() was -not- due to a preemption. To avoid adding overhead to the common-case context-switch path, this commit hides the rcu_note_context_switch() check under an existing non-common-case check. Suggested-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-04-21srcu: Expedite srcu_schedule_cbs_snp() callback invocationPaul E. McKenney1-1/+2
Although Tree SRCU does reduce delays when there is at least one synchronize_srcu_expedited() invocation pending, srcu_schedule_cbs_snp() still waits for SRCU_INTERVAL before invoking callbacks. Since synchronize_srcu_expedited() now posts a callback and waits for that callback to do a wakeup, this destroys the expedited nature of synchronize_srcu_expedited(). This destruction became apparent to Marc Zyngier in the guise of a guest-OS bootup slowdown from five seconds to no fewer than forty seconds. This commit therefore invokes callbacks immediately at the end of the grace period when there is at least one synchronize_srcu_expedited() invocation pending. This brought Marc's guest-OS bootup times back into the realm of reason. Reported-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Marc Zyngier <marc.zyngier@arm.com>
2017-04-21srcu: Parallelize callback handlingPaul E. McKenney4-131/+545
Peter Zijlstra proposed using SRCU to reduce mmap_sem contention [1,2], however, there are workloads that could result in a high volume of concurrent invocations of call_srcu(), which with current SRCU would result in excessive lock contention on the srcu_struct structure's ->queue_lock, which protects SRCU's callback lists. This commit therefore moves SRCU to per-CPU callback lists, thus greatly reducing contention. Because a given SRCU instance no longer has a single centralized callback list, starting grace periods and invoking callbacks are both more complex than in the single-list Classic SRCU implementation. Starting grace periods and handling callbacks are now handled using an srcu_node tree that is in some ways similar to the rcu_node trees used by RCU-bh, RCU-preempt, and RCU-sched (for example, the srcu_node tree shape is controlled by exactly the same Kconfig options and boot parameters that control the shape of the rcu_node tree). In addition, the old per-CPU srcu_array structure is now named srcu_data and contains an rcu_segcblist structure named ->srcu_cblist for its callbacks (and a spinlock to protect this). The srcu_struct gets an srcu_gp_seq that is used to associate callback segments with the corresponding completion-time grace-period number. These completion-time grace-period numbers are propagated up the srcu_node tree so that the grace-period workqueue handler can determine whether additional grace periods are needed on the one hand and where to look for callbacks that are ready to be invoked. The srcu_barrier() function must now wait on all instances of the per-CPU ->srcu_cblist. Because each ->srcu_cblist is protected by ->lock, srcu_barrier() can remotely add the needed callbacks. In theory, it could also remotely start grace periods, but in practice doing so is complex and racy. And interestingly enough, it is never necessary for srcu_barrier() to start a grace period because srcu_barrier() only enqueues a callback when a callback is already present--and it turns out that a grace period has to have already been started for this pre-existing callback. Furthermore, it is only the callback that srcu_barrier() needs to wait on, not any particular grace period. Therefore, a new rcu_segcblist_entrain() function enqueues the srcu_barrier() function's callback into the same segment occupied by the last pre-existing callback in the list. The special case where all the pre-existing callbacks are on a different list (because they are in the process of being invoked) is handled by enqueuing srcu_barrier()'s callback into the RCU_DONE_TAIL segment, relying on the done-callbacks check that takes place after all callbacks are inovked. Note that the readers use the same algorithm as before. Note that there is a separate srcu_idx that tells the readers what counter to increment. This unfortunately cannot be combined with srcu_gp_seq because they need to be incremented at different times. This commit introduces some ugly #ifdefs in rcutorture. These will go away when I feel good enough about Tree SRCU to ditch Classic SRCU. Some crude performance comparisons, courtesy of a quickly hacked rcuperf asynchronous-grace-period capability: Callback Queuing Overhead ------------------------- # CPUS Classic SRCU Tree SRCU ------ ------------ --------- 2 0.349 us 0.342 us 16 31.66 us 0.4 us 41 --------- 0.417 us The times are the 90th percentiles, a statistic that was chosen to reject the overheads of the occasional srcu_barrier() call needed to avoid OOMing the test machine. The rcuperf test hangs when running Classic SRCU at 41 CPUs, hence the line of dashes. Despite the hacks to both the rcuperf code and that statistics, this is a convincing demonstration of Tree SRCU's performance and scalability advantages. [1] https://lwn.net/Articles/309030/ [2] https://patchwork.kernel.org/patch/5108281/ Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> [ paulmck: Fix initialization if synchronize_srcu_expedited() called first. ]
2017-04-21padata: get_next is never NULLJason A. Donenfeld1-9/+4
Per Dan's static checker warning, the code that returns NULL was removed in 2010, so this patch updates the comments and fixes the code assumptions. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-04-20tracing/ftrace: Allow for instances to trigger their own stacktrace probesSteven Rostedt (VMware)1-3/+14
Have the stacktrace function trigger probe trigger stack traces within the instance that they were added to in the set_ftrace_filter. ># cd /sys/kernel/debug/tracing ># mkdir instances/foo ># cd instances/foo ># echo schedule:stacktrace:1 > set_ftrace_filter ># cat trace # tracer: nop # # entries-in-buffer/entries-written: 1/1 #P:4 # # _-----=> irqs-off # / _----=> need-resched # | / _---=> hardirq/softirq # || / _--=> preempt-depth # ||| / delay # TASK-PID CPU# |||| TIMESTAMP FUNCTION # | | | |||| | | <idle>-0 [001] .N.2 202.585010: <stack trace> => => schedule => schedule_preempt_disabled => do_idle => cpu_startup_entry => start_secondary => verify_cpu Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-04-20tracing/ftrace: Allow for the traceonoff probe be unique to instancesSteven Rostedt (VMware)3-12/+15
Have the traceon/off function probe triggers affect only the instance they are set in. This required making the trace_on/off accessible for other files in the tracing directory. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-04-20tracing/ftrace: Enable snapshot function trigger to work with instancesSteven Rostedt (VMware)1-19/+25
Modify the snapshot probe trigger to work with instances. This way the snapshot function trigger will only affect the instance that it is added to in the set_ftrace_filter file. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-04-20tracing/ftrace: Allow instances to have their own function probesSteven Rostedt (VMware)1-2/+2
Pass around the local trace_array that is the descriptor for tracing instances, when enabling and disabling probes. This by default sets the enable/disable of event probe triggers to work with instances. The other probes will need some more work to get them working with instances. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-04-20tracing/ftrace: Add a better way to pass data via the probe functionsSteven Rostedt (VMware)5-107/+153
With the redesign of the registration and execution of the function probes (triggers), data can now be passed from the setup of the probe to the probe callers that are specific to the trace_array it is on. Although, all probes still only affect the toplevel trace array, this change will allow for instances to have their own probes separated from other instances and the top array. That is, something like the stacktrace probe can be set to trace only in an instance and not the toplevel trace array. This isn't implement yet, but this change sets the ground work for the change. When a probe callback is triggered (someone writes the probe format into set_ftrace_filter), it calls register_ftrace_function_probe() passing in init_data that will be used to initialize the probe. Then for every matching function, register_ftrace_function_probe() will call the probe_ops->init() function with the init data that was passed to it, as well as an address to a place holder that is associated with the probe and the instance. The first occurrence will have a NULL in the pointer. The init() function will then initialize it. If other probes are added, or more functions are part of the probe, the place holder will be passed to the init() function with the place holder data that it was initialized to the last time. Then this place_holder is passed to each of the other probe_ops functions, where it can be used in the function callback. When the probe_ops free() function is called, it can be called either with the rip of the function that is being removed from the probe, or zero, indicating that there are no more functions attached to the probe, and the place holder is about to be freed. This gives the probe_ops a way to free the data it assigned to the place holder if it was allocade during the first init call. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-04-20ftrace: Dynamically create the probe ftrace_ops for the trace_arraySteven Rostedt (VMware)5-57/+146
In order to eventually have each trace_array instance have its own unique set of function probes (triggers), the trace array needs to hold the ops and the filters for the probes. This is the first step to accomplish this. Instead of having the private data of the probe ops point to the trace_array, create a separate list that the trace_array holds. There's only one private_data for a probe, we need one per trace_array. The probe ftrace_ops will be dynamically created for each instance, instead of being static. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-04-20tracing: Pass the trace_array into ftrace_probe_ops functionsSteven Rostedt (VMware)5-30/+51
Pass the trace_array associated to a ftrace_probe_ops into the probe_ops func(), init() and free() functions. The trace_array is the descriptor that describes a tracing instance. This will help create the infrastructure that will allow having function probes unique to tracing instances. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-04-20tracing: Have the trace_array hold the list of registered func probesSteven Rostedt (VMware)5-30/+56
Add a link list to the trace_array to hold func probes that are registered. Currently, all function probes are the same for all instances as it was before, that is, only the top level trace_array holds the function probes. But this lays the ground work to have function probes be attached to individual instances, and having the event trigger only affect events in the given instance. But that work is still to be done. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-04-20ftrace: If the hash for a probe fails to update then free what was initializedSteven Rostedt (VMware)1-1/+15
If the ftrace_hash_move_and_update_ops() fails, and an ops->free() function exists, then it needs to be called on all the ops that were added by this registration. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-04-20ftrace: Have the function probes call their own functionSteven Rostedt (VMware)2-127/+99
Now that the function probes have their own ftrace_ops, there's no reason to continue using the ftrace_func_hash to find which probe to call in the function callback. The ops that is passed in to the function callback is part of the probe_ops to call. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-04-20ftrace: Have each function probe use its own ftrace_opsSteven Rostedt (VMware)2-148/+99
Have the function probes have their own ftrace_ops, and remove the trace_probe_ops. This simplifies some of the ftrace infrastructure code. Individual entries for each function is still allocated for the use of the output for set_ftrace_filter, but they will be removed soon too. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-04-20ftrace: Have unregister_ftrace_function_probe_func() return a valueSteven Rostedt (VMware)5-14/+17
Currently unregister_ftrace_function_probe_func() is a void function. It does not give any feedback if an error occurred or no item was found to remove and nothing was done. Change it to return status and success if it removed something. Also update the callers to return that feedback to the user. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-04-20ftrace: Add helper function ftrace_hash_move_and_update_ops()Steven Rostedt (VMware)1-52/+53
The processes of updating a ops filter_hash is a bit complex, and requires setting up an old hash to perform the update. This is done exactly the same in two locations for the same reasons. Create a helper function that does it in one place. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-04-20ftrace: Remove data field from ftrace_func_probe structureSteven Rostedt (VMware)5-15/+11
No users of the function probes uses the data field anymore. Remove it, and change the init function to take a void *data parameter instead of a void **data, because the init will just get the data that the registering function was received, and there's no state after it is called. The other functions for ftrace_probe_ops still take the data parameter, but it will currently only be passed NULL. It will stay as a parameter for future data to be passed to these functions. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-04-20ftrace: Remove printing of data in showing of a function probeSteven Rostedt (VMware)1-5/+1
None of the probe users uses the data field anymore of the entry. They all have their own print() function. Remove showing the data field in the generic function as the data field will be going away. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-04-20ftrace: Remove unused unregister_ftrace_function_probe_all() functionSteven Rostedt (VMware)2-20/+3
There are no users of unregister_ftrace_function_probe_all(). The only probe function that is used is unregister_ftrace_function_probe_func(). Rename the internal static function __unregister_ftrace_function_probe() to unregister_ftrace_function_probe_func() and make it global. Also remove the PROBE_TEST_FUNC as it would be always set. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-04-20ftrace: Remove unused unregister_ftrace_function_probe() functionSteven Rostedt (VMware)2-18/+3
Nothing calls unregister_ftrace_function_probe(). Remove it as well as the flag PROBE_TEST_DATA, as this function was the only one to set it. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-04-20ftrace: Convert the rest of the function trigger over to the mapping functionsSteven Rostedt (VMware)1-38/+85
As the data pointer for individual ips will soon be removed and no longer passed to the callback function probe handlers, convert the rest of the function trigger counters over to the new ftrace_func_mapper helper functions. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-04-20tracing: Have the snapshot trigger use the mapping helper functionsSteven Rostedt (VMware)1-8/+44
As the data pointer for individual ips will soon be removed and no longer passed to the callback function probe handlers, convert the snapshot trigger counter over to the new ftrace_func_mapper helper functions. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-04-20ftrace: Added ftrace_func_mapper for function probe triggersSteven Rostedt (VMware)3-15/+210
In order to move the ops to the function probes directly, they need a way to map function ips to their own data without depending on the infrastructure of the function probes, as the data field will be going away. New helper functions are added that are based on the ftrace_hash code. ftrace_func_mapper functions are there to let the probes map ips to their data. These can be allocated by the probe ops, and referenced in the function callbacks. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-04-20ftrace: Pass probe ops to probe functionSteven Rostedt (VMware)5-14/+27
In preparation to cleaning up the probe function registration code, the "data" parameter will eventually be removed from the probe->func() call. Instead it will receive its own "ops" function, in which it can set up its own data that it needs to map. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-04-20ftrace: Remove unused "flags" field from struct ftrace_func_probeSteven Rostedt (VMware)1-1/+0
Nothing uses "flags" in the ftrace_func_probe descriptor. Remove it. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-04-20ftrace: Move the function commands into the tracing directorySteven Rostedt (VMware)1-0/+20
As nothing outside the tracing directory uses the function command mechanism, I'm moving the prototypes out of the include/linux/ftrace.h and into the local kernel/trace/trace.h header. I plan on making them hook to the trace_array structure which is local to kernel/trace, and I do not want to expose it to the rest of the kernel. This requires that the command functions must also be local to tracing. But luckily nothing else uses them. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-04-20Merge tag 'trace-v4.11-rc5-5' of ↵Linus Torvalds2-5/+19
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull two more ftrace fixes from Steven Rostedt: "While continuing my development, I uncovered two more small bugs. One is a race condition when enabling the snapshot function probe trigger. It enables the probe before allocating the snapshot, and if the probe triggers first, it stops tracing with a warning that the snapshot buffer was not allocated. The seconds is that the snapshot file should show how to use it when it is empty. But a bug fix from long ago broke the "is empty" test and the snapshot file no longer displays the help message" * tag 'trace-v4.11-rc5-5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: ring-buffer: Have ring_buffer_iter_empty() return true when empty tracing: Allocate the snapshot buffer before enabling probe
2017-04-20block: remove the errors field from struct requestChristoph Hellwig1-14/+12
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-04-20blktrace: remove the unused block_rq_abort tracepointChristoph Hellwig1-9/+0
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-04-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller6-8/+40
A function in kernel/bpf/syscall.c which got a bug fix in 'net' was moved to kernel/bpf/verifier.c in 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-20Merge branch 'linus' into irq/coreThomas Gleixner18-131/+199
Pick up upstream fixes to avoid conflicts with pending patches.
2017-04-20genirq/affinity: Fix calculating vectors to assignKeith Busch1-1/+1
The vectors_per_node is calculated from the remaining available vectors. The current vector starts after pre_vectors, so we need to subtract that from the current to properly account for the number of remaining vectors to assign. Fixes: 3412386b531 ("irq/affinity: Fix extra vecs calculation") Reported-by: Andrei Vagin <avagin@virtuozzo.com> Signed-off-by: Keith Busch <keith.busch@intel.com> Link: http://lkml.kernel.org/r/1492645870-13019-1-git-send-email-keith.busch@intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-04-20powerpc/kprobes: Fix handling of function offsets on ABIv2Naveen N. Rao1-3/+4
commit 239aeba76409 ("perf powerpc: Fix kprobe and kretprobe handling with kallsyms on ppc64le") changed how we use the offset field in struct kprobe on ABIv2. perf now offsets from the global entry point if an offset is specified and otherwise chooses the local entry point. Fix the same in kernel for kprobe API users. We do this by extending kprobe_lookup_name() to accept an additional parameter to indicate the offset specified with the kprobe registration. If offset is 0, we return the local function entry and return the global entry point otherwise. With: # cd /sys/kernel/debug/tracing/ # echo "p _do_fork" >> kprobe_events # echo "p _do_fork+0x10" >> kprobe_events before this patch: # cat ../kprobes/list c0000000000d0748 k _do_fork+0x8 [DISABLED] c0000000000d0758 k _do_fork+0x18 [DISABLED] c0000000000412b0 k kretprobe_trampoline+0x0 [OPTIMIZED] and after: # cat ../kprobes/list c0000000000d04c8 k _do_fork+0x8 [DISABLED] c0000000000d04d0 k _do_fork+0x10 [DISABLED] c0000000000412b0 k kretprobe_trampoline+0x0 [OPTIMIZED] Acked-by: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-20kprobes: Convert kprobe_lookup_name() to a functionNaveen N. Rao1-12/+8
The macro is now pretty long and ugly on powerpc. In the light of further changes needed here, convert it to a __weak variant to be over-ridden with a nicer looking function. Suggested-by: Masami Hiramatsu <mhiramat@kernel.org> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-20kprobes: Skip preparing optprobe if the probe is ftrace-basedMasami Hiramatsu1-2/+9
Skip preparing optprobe if the probe is ftrace-based, since anyway, it must not be optimized (or already optimized by ftrace). Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-20timer/sysclt: Restrict timer migration sysctl values to 0 and 1Myungho Jung2-1/+3
timer_migration sysctl acts as a boolean switch, so the allowed values should be restricted to 0 and 1. Add the necessary extra fields to the sysctl table entry to enforce that. [ tglx: Rewrote changelog ] Signed-off-by: Myungho Jung <mhjungk@gmail.com> Link: http://lkml.kernel.org/r/1492640690-3550-1-git-send-email-mhjungk@gmail.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-04-19ring-buffer: Have ring_buffer_iter_empty() return true when emptySteven Rostedt (VMware)1-2/+14
I noticed that reading the snapshot file when it is empty no longer gives a status. It suppose to show the status of the snapshot buffer as well as how to allocate and use it. For example: ># cat snapshot # tracer: nop # # # * Snapshot is allocated * # # Snapshot commands: # echo 0 > snapshot : Clears and frees snapshot buffer # echo 1 > snapshot : Allocates snapshot buffer, if not already allocated. # Takes a snapshot of the main buffer. # echo 2 > snapshot : Clears snapshot buffer (but does not allocate or free) # (Doesn't have to be '2' works with any number that # is not a '0' or '1') But instead it just showed an empty buffer: ># cat snapshot # tracer: nop # # entries-in-buffer/entries-written: 0/0 #P:4 # # _-----=> irqs-off # / _----=> need-resched # | / _---=> hardirq/softirq # || / _--=> preempt-depth # ||| / delay # TASK-PID CPU# |||| TIMESTAMP FUNCTION # | | | |||| | | What happened was that it was using the ring_buffer_iter_empty() function to see if it was empty, and if it was, it showed the status. But that function was returning false when it was empty. The reason was that the iter header page was on the reader page, and the reader page was empty, but so was the buffer itself. The check only tested to see if the iter was on the commit page, but the commit page was no longer pointing to the reader page, but as all pages were empty, the buffer is also. Cc: stable@vger.kernel.org Fixes: 651e22f2701b ("ring-buffer: Always reset iterator to reader page") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>