diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2019-07-08 16:12:03 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2019-07-08 16:12:03 -0700 |
commit | e1928328699a582a540b105e5f4c160832a7fdcb (patch) | |
tree | f36bb303b8648189d7b5a7feb27e58fe9fe3b9f0 /Documentation | |
parent | 46f1ec23a46940846f86a91c46f7119d8a8b5de1 (diff) | |
parent | 9156e545765e467e6268c4814cfa609ebb16237e (diff) | |
download | linux-e1928328699a582a540b105e5f4c160832a7fdcb.tar.bz2 |
Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:
"The main changes in this cycle are:
- rwsem scalability improvements, phase #2, by Waiman Long, which are
rather impressive:
"On a 2-socket 40-core 80-thread Skylake system with 40 reader
and writer locking threads, the min/mean/max locking operations
done in a 5-second testing window before the patchset were:
40 readers, Iterations Min/Mean/Max = 1,807/1,808/1,810
40 writers, Iterations Min/Mean/Max = 1,807/50,344/151,255
After the patchset, they became:
40 readers, Iterations Min/Mean/Max = 30,057/31,359/32,741
40 writers, Iterations Min/Mean/Max = 94,466/95,845/97,098"
There's a lot of changes to the locking implementation that makes
it similar to qrwlock, including owner handoff for more fair
locking.
Another microbenchmark shows how across the spectrum the
improvements are:
"With a locking microbenchmark running on 5.1 based kernel, the
total locking rates (in kops/s) on a 2-socket Skylake system
with equal numbers of readers and writers (mixed) before and
after this patchset were:
# of Threads Before Patch After Patch
------------ ------------ -----------
2 2,618 4,193
4 1,202 3,726
8 802 3,622
16 729 3,359
32 319 2,826
64 102 2,744"
The changes are extensive and the patch-set has been through
several iterations addressing various locking workloads. There
might be more regressions, but unless they are pathological I
believe we want to use this new implementation as the baseline
going forward.
- jump-label optimizations by Daniel Bristot de Oliveira: the primary
motivation was to remove IPI disturbance of isolated RT-workload
CPUs, which resulted in the implementation of batched jump-label
updates. Beyond the improvement of the real-time characteristics
kernel, in one test this patchset improved static key update
overhead from 57 msecs to just 1.4 msecs - which is a nice speedup
as well.
- atomic64_t cross-arch type cleanups by Mark Rutland: over the last
~10 years of atomic64_t existence the various types used by the
APIs only had to be self-consistent within each architecture -
which means they became wildly inconsistent across architectures.
Mark puts and end to this by reworking all the atomic64
implementations to use 's64' as the base type for atomic64_t, and
to ensure that this type is consistently used for parameters and
return values in the API, avoiding further problems in this area.
- A large set of small improvements to lockdep by Yuyang Du: type
cleanups, output cleanups, function return type and othr cleanups
all around the place.
- A set of percpu ops cleanups and fixes by Peter Zijlstra.
- Misc other changes - please see the Git log for more details"
* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (82 commits)
locking/lockdep: increase size of counters for lockdep statistics
locking/atomics: Use sed(1) instead of non-standard head(1) option
locking/lockdep: Move mark_lock() inside CONFIG_TRACE_IRQFLAGS && CONFIG_PROVE_LOCKING
x86/jump_label: Make tp_vec_nr static
x86/percpu: Optimize raw_cpu_xchg()
x86/percpu, sched/fair: Avoid local_clock()
x86/percpu, x86/irq: Relax {set,get}_irq_regs()
x86/percpu: Relax smp_processor_id()
x86/percpu: Differentiate this_cpu_{}() and __this_cpu_{}()
locking/rwsem: Guard against making count negative
locking/rwsem: Adaptive disabling of reader optimistic spinning
locking/rwsem: Enable time-based spinning on reader-owned rwsem
locking/rwsem: Make rwsem->owner an atomic_long_t
locking/rwsem: Enable readers spinning on writer
locking/rwsem: Clarify usage of owner's nonspinaable bit
locking/rwsem: Wake up almost all readers in wait queue
locking/rwsem: More optimal RT task handling of null owner
locking/rwsem: Always release wait_lock before waking up tasks
locking/rwsem: Implement lock handoff to prevent lock starvation
locking/rwsem: Make rwsem_spin_on_owner() return owner state
...
Diffstat (limited to 'Documentation')
-rw-r--r-- | Documentation/atomic_t.txt | 9 | ||||
-rw-r--r-- | Documentation/locking/lockdep-design.txt | 112 |
2 files changed, 91 insertions, 30 deletions
diff --git a/Documentation/atomic_t.txt b/Documentation/atomic_t.txt index b3afe69d03a1..0ab747e0d5ac 100644 --- a/Documentation/atomic_t.txt +++ b/Documentation/atomic_t.txt @@ -81,9 +81,11 @@ Non-RMW ops: The non-RMW ops are (typically) regular LOADs and STOREs and are canonically implemented using READ_ONCE(), WRITE_ONCE(), smp_load_acquire() and -smp_store_release() respectively. +smp_store_release() respectively. Therefore, if you find yourself only using +the Non-RMW operations of atomic_t, you do not in fact need atomic_t at all +and are doing it wrong. -The one detail to this is that atomic_set{}() should be observable to the RMW +A subtle detail of atomic_set{}() is that it should be observable to the RMW ops. That is: C atomic-set @@ -200,6 +202,9 @@ These helper barriers exist because architectures have varying implicit ordering on their SMP atomic primitives. For example our TSO architectures provide full ordered atomics and these barriers are no-ops. +NOTE: when the atomic RmW ops are fully ordered, they should also imply a +compiler barrier. + Thus: atomic_fetch_add(); diff --git a/Documentation/locking/lockdep-design.txt b/Documentation/locking/lockdep-design.txt index 39fae143c9cb..f189d130e543 100644 --- a/Documentation/locking/lockdep-design.txt +++ b/Documentation/locking/lockdep-design.txt @@ -15,34 +15,48 @@ tens of thousands of) instantiations. For example a lock in the inode struct is one class, while each inode has its own instantiation of that lock class. -The validator tracks the 'state' of lock-classes, and it tracks -dependencies between different lock-classes. The validator maintains a -rolling proof that the state and the dependencies are correct. - -Unlike an lock instantiation, the lock-class itself never goes away: when -a lock-class is used for the first time after bootup it gets registered, -and all subsequent uses of that lock-class will be attached to this -lock-class. +The validator tracks the 'usage state' of lock-classes, and it tracks +the dependencies between different lock-classes. Lock usage indicates +how a lock is used with regard to its IRQ contexts, while lock +dependency can be understood as lock order, where L1 -> L2 suggests that +a task is attempting to acquire L2 while holding L1. From lockdep's +perspective, the two locks (L1 and L2) are not necessarily related; that +dependency just means the order ever happened. The validator maintains a +continuing effort to prove lock usages and dependencies are correct or +the validator will shoot a splat if incorrect. + +A lock-class's behavior is constructed by its instances collectively: +when the first instance of a lock-class is used after bootup the class +gets registered, then all (subsequent) instances will be mapped to the +class and hence their usages and dependecies will contribute to those of +the class. A lock-class does not go away when a lock instance does, but +it can be removed if the memory space of the lock class (static or +dynamic) is reclaimed, this happens for example when a module is +unloaded or a workqueue is destroyed. State ----- -The validator tracks lock-class usage history into 4 * nSTATEs + 1 separate -state bits: +The validator tracks lock-class usage history and divides the usage into +(4 usages * n STATEs + 1) categories: +where the 4 usages can be: - 'ever held in STATE context' - 'ever held as readlock in STATE context' - 'ever held with STATE enabled' - 'ever held as readlock with STATE enabled' -Where STATE can be either one of (kernel/locking/lockdep_states.h) - - hardirq - - softirq +where the n STATEs are coded in kernel/locking/lockdep_states.h and as of +now they include: +- hardirq +- softirq +where the last 1 category is: - 'ever used' [ == !unused ] -When locking rules are violated, these state bits are presented in the -locking error messages, inside curlies. A contrived example: +When locking rules are violated, these usage bits are presented in the +locking error messages, inside curlies, with a total of 2 * n STATEs bits. +A contrived example: modprobe/2287 is trying to acquire lock: (&sio_locks[i].lock){-.-.}, at: [<c02867fd>] mutex_lock+0x21/0x24 @@ -51,28 +65,67 @@ locking error messages, inside curlies. A contrived example: (&sio_locks[i].lock){-.-.}, at: [<c02867fd>] mutex_lock+0x21/0x24 -The bit position indicates STATE, STATE-read, for each of the states listed -above, and the character displayed in each indicates: +For a given lock, the bit positions from left to right indicate the usage +of the lock and readlock (if exists), for each of the n STATEs listed +above respectively, and the character displayed at each bit position +indicates: '.' acquired while irqs disabled and not in irq context '-' acquired in irq context '+' acquired with irqs enabled '?' acquired in irq context with irqs enabled. -Unused mutexes cannot be part of the cause of an error. +The bits are illustrated with an example: + + (&sio_locks[i].lock){-.-.}, at: [<c02867fd>] mutex_lock+0x21/0x24 + |||| + ||| \-> softirq disabled and not in softirq context + || \--> acquired in softirq context + | \---> hardirq disabled and not in hardirq context + \----> acquired in hardirq context + + +For a given STATE, whether the lock is ever acquired in that STATE +context and whether that STATE is enabled yields four possible cases as +shown in the table below. The bit character is able to indicate which +exact case is for the lock as of the reporting time. + + ------------------------------------------- + | | irq enabled | irq disabled | + |-------------------------------------------| + | ever in irq | ? | - | + |-------------------------------------------| + | never in irq | + | . | + ------------------------------------------- + +The character '-' suggests irq is disabled because if otherwise the +charactor '?' would have been shown instead. Similar deduction can be +applied for '+' too. + +Unused locks (e.g., mutexes) cannot be part of the cause of an error. Single-lock state rules: ------------------------ +A lock is irq-safe means it was ever used in an irq context, while a lock +is irq-unsafe means it was ever acquired with irq enabled. + A softirq-unsafe lock-class is automatically hardirq-unsafe as well. The -following states are exclusive, and only one of them is allowed to be -set for any lock-class: +following states must be exclusive: only one of them is allowed to be set +for any lock-class based on its usage: + + <hardirq-safe> or <hardirq-unsafe> + <softirq-safe> or <softirq-unsafe> - <hardirq-safe> and <hardirq-unsafe> - <softirq-safe> and <softirq-unsafe> +This is because if a lock can be used in irq context (irq-safe) then it +cannot be ever acquired with irq enabled (irq-unsafe). Otherwise, a +deadlock may happen. For example, in the scenario that after this lock +was acquired but before released, if the context is interrupted this +lock will be attempted to acquire twice, which creates a deadlock, +referred to as lock recursion deadlock. -The validator detects and reports lock usage that violate these +The validator detects and reports lock usage that violates these single-lock state rules. Multi-lock dependency rules: @@ -81,15 +134,18 @@ Multi-lock dependency rules: The same lock-class must not be acquired twice, because this could lead to lock recursion deadlocks. -Furthermore, two locks may not be taken in different order: +Furthermore, two locks can not be taken in inverse order: <L1> -> <L2> <L2> -> <L1> -because this could lead to lock inversion deadlocks. (The validator -finds such dependencies in arbitrary complexity, i.e. there can be any -other locking sequence between the acquire-lock operations, the -validator will still track all dependencies between locks.) +because this could lead to a deadlock - referred to as lock inversion +deadlock - as attempts to acquire the two locks form a circle which +could lead to the two contexts waiting for each other permanently. The +validator will find such dependency circle in arbitrary complexity, +i.e., there can be any other locking sequence between the acquire-lock +operations; the validator will still find whether these locks can be +acquired in a circular fashion. Furthermore, the following usage based lock dependencies are not allowed between any two lock-classes: |