summaryrefslogtreecommitdiffstats
path: root/Documentation
diff options
context:
space:
mode:
authorLinus Torvalds <torvalds@linux-foundation.org>2022-05-23 11:46:51 -0700
committerLinus Torvalds <torvalds@linux-foundation.org>2022-05-23 11:46:51 -0700
commit1e57930e9f4083ad5854ab6eadffe790a8167fb4 (patch)
tree2056cde00f9578dcf76d9fe3ec89b5cdcd9d2bae /Documentation
parentb2f02e9cdaade3ce565e6973ac313d2f814447f2 (diff)
parentce13389053a347aa9f8ffbfda2238352536e15c9 (diff)
downloadlinux-1e57930e9f4083ad5854ab6eadffe790a8167fb4.tar.bz2
Merge tag 'rcu.2022.05.19a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
Pull RCU update from Paul McKenney: - Documentation updates - Miscellaneous fixes - Callback-offloading updates, mainly simplifications - RCU-tasks updates, including some -rt fixups, handling of systems with sparse CPU numbering, and a fix for a boot-time race-condition failure - Put SRCU on a memory diet in order to reduce the size of the srcu_struct structure - Torture-test updates fixing some bugs in tests and closing some testing holes - Torture-test updates for the RCU tasks flavors, most notably ensuring that building rcutorture and friends does not change the RCU-tasks-related Kconfig options - Torture-test scripting updates - Expedited grace-period updates, most notably providing milliseconds-scale (not all that) soft real-time response from synchronize_rcu_expedited(). This is also the first time in almost 30 years of RCU that someone other than me has pushed for a reduction in the RCU CPU stall-warning timeout, in this case by more than three orders of magnitude from 21 seconds to 20 milliseconds. This tighter timeout applies only to expedited grace periods * tag 'rcu.2022.05.19a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (80 commits) rcu: Move expedited grace period (GP) work to RT kthread_worker rcu: Introduce CONFIG_RCU_EXP_CPU_STALL_TIMEOUT srcu: Drop needless initialization of sdp in srcu_gp_start() srcu: Prevent expedited GPs and blocking readers from consuming CPU srcu: Add contention check to call_srcu() srcu_data ->lock acquisition srcu: Automatically determine size-transition strategy at boot rcutorture: Make torture.sh allow for --kasan rcutorture: Make torture.sh refscale and rcuscale specify Tasks Trace RCU rcutorture: Make kvm.sh allow more memory for --kasan runs torture: Save "make allmodconfig" .config file scftorture: Remove extraneous "scf" from per_version_boot_params rcutorture: Adjust scenarios' Kconfig options for CONFIG_PREEMPT_DYNAMIC torture: Enable CSD-lock stall reports for scftorture torture: Skip vmlinux check for kvm-again.sh runs scftorture: Adjust for TASKS_RCU Kconfig option being selected rcuscale: Allow rcuscale without RCU Tasks Rude/Trace rcuscale: Allow rcuscale without RCU Tasks refscale: Allow refscale without RCU Tasks Rude/Trace refscale: Allow refscale without RCU Tasks rcutorture: Allow specifying per-scenario stat_interval ...
Diffstat (limited to 'Documentation')
-rw-r--r--Documentation/RCU/Design/Data-Structures/Data-Structures.rst2
-rw-r--r--Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.rst2
-rw-r--r--Documentation/RCU/Design/Requirements/Requirements.rst36
-rw-r--r--Documentation/RCU/arrayRCU.rst4
-rw-r--r--Documentation/RCU/checklist.rst9
-rw-r--r--Documentation/RCU/rcu.rst13
-rw-r--r--Documentation/RCU/rculist_nulls.rst2
-rw-r--r--Documentation/RCU/stallwarn.rst20
-rw-r--r--Documentation/RCU/whatisRCU.rst18
-rw-r--r--Documentation/admin-guide/kernel-parameters.txt85
10 files changed, 159 insertions, 32 deletions
diff --git a/Documentation/RCU/Design/Data-Structures/Data-Structures.rst b/Documentation/RCU/Design/Data-Structures/Data-Structures.rst
index f4efd6897b09..b34990c7c377 100644
--- a/Documentation/RCU/Design/Data-Structures/Data-Structures.rst
+++ b/Documentation/RCU/Design/Data-Structures/Data-Structures.rst
@@ -973,7 +973,7 @@ The ``->dynticks`` field counts the corresponding CPU's transitions to
and from either dyntick-idle or user mode, so that this counter has an
even value when the CPU is in dyntick-idle mode or user mode and an odd
value otherwise. The transitions to/from user mode need to be counted
-for user mode adaptive-ticks support (see timers/NO_HZ.txt).
+for user mode adaptive-ticks support (see Documentation/timers/no_hz.rst).
The ``->rcu_need_heavy_qs`` field is used to record the fact that the
RCU core code would really like to see a quiescent state from the
diff --git a/Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.rst b/Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.rst
index 6f89cf1e567d..c9c957c85bac 100644
--- a/Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.rst
+++ b/Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.rst
@@ -406,7 +406,7 @@ In earlier implementations, the task requesting the expedited grace
period also drove it to completion. This straightforward approach had
the disadvantage of needing to account for POSIX signals sent to user
tasks, so more recent implemementations use the Linux kernel's
-`workqueues <https://www.kernel.org/doc/Documentation/core-api/workqueue.rst>`__.
+workqueues (see Documentation/core-api/workqueue.rst).
The requesting task still does counter snapshotting and funnel-lock
processing, but the task reaching the top of the funnel lock does a
diff --git a/Documentation/RCU/Design/Requirements/Requirements.rst b/Documentation/RCU/Design/Requirements/Requirements.rst
index 45278e2974c0..04ed8bf27a0e 100644
--- a/Documentation/RCU/Design/Requirements/Requirements.rst
+++ b/Documentation/RCU/Design/Requirements/Requirements.rst
@@ -370,8 +370,8 @@ pointer fetched by rcu_dereference() may not be used outside of the
outermost RCU read-side critical section containing that
rcu_dereference(), unless protection of the corresponding data
element has been passed from RCU to some other synchronization
-mechanism, most commonly locking or `reference
-counting <https://www.kernel.org/doc/Documentation/RCU/rcuref.txt>`__.
+mechanism, most commonly locking or reference counting
+(see ../../rcuref.rst).
.. |high-quality implementation of C11 memory_order_consume [PDF]| replace:: high-quality implementation of C11 ``memory_order_consume`` [PDF]
.. _high-quality implementation of C11 memory_order_consume [PDF]: http://www.rdrop.com/users/paulmck/RCU/consume.2015.07.13a.pdf
@@ -2654,6 +2654,38 @@ synchronize_rcu(), and rcu_barrier(), respectively. In
three APIs are therefore implemented by separate functions that check
for voluntary context switches.
+Tasks Rude RCU
+~~~~~~~~~~~~~~
+
+Some forms of tracing need to wait for all preemption-disabled regions
+of code running on any online CPU, including those executed when RCU is
+not watching. This means that synchronize_rcu() is insufficient, and
+Tasks Rude RCU must be used instead. This flavor of RCU does its work by
+forcing a workqueue to be scheduled on each online CPU, hence the "Rude"
+moniker. And this operation is considered to be quite rude by real-time
+workloads that don't want their ``nohz_full`` CPUs receiving IPIs and
+by battery-powered systems that don't want their idle CPUs to be awakened.
+
+The tasks-rude-RCU API is also reader-marking-free and thus quite compact,
+consisting of call_rcu_tasks_rude(), synchronize_rcu_tasks_rude(),
+and rcu_barrier_tasks_rude().
+
+Tasks Trace RCU
+~~~~~~~~~~~~~~~
+
+Some forms of tracing need to sleep in readers, but cannot tolerate
+SRCU's read-side overhead, which includes a full memory barrier in both
+srcu_read_lock() and srcu_read_unlock(). This need is handled by a
+Tasks Trace RCU that uses scheduler locking and IPIs to synchronize with
+readers. Real-time systems that cannot tolerate IPIs may build their
+kernels with ``CONFIG_TASKS_TRACE_RCU_READ_MB=y``, which avoids the IPIs at
+the expense of adding full memory barriers to the read-side primitives.
+
+The tasks-trace-RCU API is also reasonably compact,
+consisting of rcu_read_lock_trace(), rcu_read_unlock_trace(),
+rcu_read_lock_trace_held(), call_rcu_tasks_trace(),
+synchronize_rcu_tasks_trace(), and rcu_barrier_tasks_trace().
+
Possible Future Changes
-----------------------
diff --git a/Documentation/RCU/arrayRCU.rst b/Documentation/RCU/arrayRCU.rst
index 4051ea3871ef..a5f2ff8fc54c 100644
--- a/Documentation/RCU/arrayRCU.rst
+++ b/Documentation/RCU/arrayRCU.rst
@@ -33,8 +33,8 @@ Situation 1: Hash Tables
Hash tables are often implemented as an array, where each array entry
has a linked-list hash chain. Each hash chain can be protected by RCU
-as described in the listRCU.txt document. This approach also applies
-to other array-of-list situations, such as radix trees.
+as described in listRCU.rst. This approach also applies to other
+array-of-list situations, such as radix trees.
.. _static_arrays:
diff --git a/Documentation/RCU/checklist.rst b/Documentation/RCU/checklist.rst
index f4545b7c9a63..42cc5d891bd2 100644
--- a/Documentation/RCU/checklist.rst
+++ b/Documentation/RCU/checklist.rst
@@ -140,8 +140,7 @@ over a rather long period of time, but improvements are always welcome!
prevents destructive compiler optimizations. However,
with a bit of devious creativity, it is possible to
mishandle the return value from rcu_dereference().
- Please see rcu_dereference.txt in this directory for
- more information.
+ Please see rcu_dereference.rst for more information.
The rcu_dereference() primitive is used by the
various "_rcu()" list-traversal primitives, such
@@ -151,7 +150,7 @@ over a rather long period of time, but improvements are always welcome!
primitives. This is particularly useful in code that
is common to readers and updaters. However, lockdep
will complain if you access rcu_dereference() outside
- of an RCU read-side critical section. See lockdep.txt
+ of an RCU read-side critical section. See lockdep.rst
to learn what to do about this.
Of course, neither rcu_dereference() nor the "_rcu()"
@@ -323,7 +322,7 @@ over a rather long period of time, but improvements are always welcome!
primitives when the update-side lock is held is that doing so
can be quite helpful in reducing code bloat when common code is
shared between readers and updaters. Additional primitives
- are provided for this case, as discussed in lockdep.txt.
+ are provided for this case, as discussed in lockdep.rst.
One exception to this rule is when data is only ever added to
the linked data structure, and is never removed during any
@@ -480,4 +479,4 @@ over a rather long period of time, but improvements are always welcome!
both rcu_barrier() and synchronize_rcu(), if necessary, using
something like workqueues to to execute them concurrently.
- See rcubarrier.txt for more information.
+ See rcubarrier.rst for more information.
diff --git a/Documentation/RCU/rcu.rst b/Documentation/RCU/rcu.rst
index 0e03c6ef3147..3cfe01ba9a49 100644
--- a/Documentation/RCU/rcu.rst
+++ b/Documentation/RCU/rcu.rst
@@ -10,9 +10,8 @@ A "grace period" must elapse between the two parts, and this grace period
must be long enough that any readers accessing the item being deleted have
since dropped their references. For example, an RCU-protected deletion
from a linked list would first remove the item from the list, wait for
-a grace period to elapse, then free the element. See the
-:ref:`Documentation/RCU/listRCU.rst <list_rcu_doc>` for more information on
-using RCU with linked lists.
+a grace period to elapse, then free the element. See listRCU.rst for more
+information on using RCU with linked lists.
Frequently Asked Questions
--------------------------
@@ -50,7 +49,7 @@ Frequently Asked Questions
- If I am running on a uniprocessor kernel, which can only do one
thing at a time, why should I wait for a grace period?
- See :ref:`Documentation/RCU/UP.rst <up_doc>` for more information.
+ See UP.rst for more information.
- How can I see where RCU is currently used in the Linux kernel?
@@ -64,13 +63,13 @@ Frequently Asked Questions
- What guidelines should I follow when writing code that uses RCU?
- See the checklist.txt file in this directory.
+ See checklist.rst.
- Why the name "RCU"?
"RCU" stands for "read-copy update".
- :ref:`Documentation/RCU/listRCU.rst <list_rcu_doc>` has more information on where
- this name came from, search for "read-copy update" to find it.
+ listRCU.rst has more information on where this name came from, search
+ for "read-copy update" to find it.
- I hear that RCU is patented? What is with that?
diff --git a/Documentation/RCU/rculist_nulls.rst b/Documentation/RCU/rculist_nulls.rst
index a9fc774bc400..ca4692775ad4 100644
--- a/Documentation/RCU/rculist_nulls.rst
+++ b/Documentation/RCU/rculist_nulls.rst
@@ -8,7 +8,7 @@ This section describes how to use hlist_nulls to
protect read-mostly linked lists and
objects using SLAB_TYPESAFE_BY_RCU allocations.
-Please read the basics in Documentation/RCU/listRCU.rst
+Please read the basics in listRCU.rst.
Using 'nulls'
=============
diff --git a/Documentation/RCU/stallwarn.rst b/Documentation/RCU/stallwarn.rst
index 78404625bad2..794837eb519b 100644
--- a/Documentation/RCU/stallwarn.rst
+++ b/Documentation/RCU/stallwarn.rst
@@ -162,6 +162,26 @@ CONFIG_RCU_CPU_STALL_TIMEOUT
Stall-warning messages may be enabled and disabled completely via
/sys/module/rcupdate/parameters/rcu_cpu_stall_suppress.
+CONFIG_RCU_EXP_CPU_STALL_TIMEOUT
+--------------------------------
+
+ Same as the CONFIG_RCU_CPU_STALL_TIMEOUT parameter but only for
+ the expedited grace period. This parameter defines the period
+ of time that RCU will wait from the beginning of an expedited
+ grace period until it issues an RCU CPU stall warning. This time
+ period is normally 20 milliseconds on Android devices. A zero
+ value causes the CONFIG_RCU_CPU_STALL_TIMEOUT value to be used,
+ after conversion to milliseconds.
+
+ This configuration parameter may be changed at runtime via the
+ /sys/module/rcupdate/parameters/rcu_exp_cpu_stall_timeout, however
+ this parameter is checked only at the beginning of a cycle. If you
+ are in a current stall cycle, setting it to a new value will change
+ the timeout for the -next- stall.
+
+ Stall-warning messages may be enabled and disabled completely via
+ /sys/module/rcupdate/parameters/rcu_cpu_stall_suppress.
+
RCU_STALL_DELAY_DELTA
---------------------
diff --git a/Documentation/RCU/whatisRCU.rst b/Documentation/RCU/whatisRCU.rst
index c34d2212eaca..77ea260efd12 100644
--- a/Documentation/RCU/whatisRCU.rst
+++ b/Documentation/RCU/whatisRCU.rst
@@ -224,7 +224,7 @@ synchronize_rcu()
be delayed. This property results in system resilience in face
of denial-of-service attacks. Code using call_rcu() should limit
update rate in order to gain this same sort of resilience. See
- checklist.txt for some approaches to limiting the update rate.
+ checklist.rst for some approaches to limiting the update rate.
rcu_assign_pointer()
^^^^^^^^^^^^^^^^^^^^
@@ -318,7 +318,7 @@ rcu_dereference()
must prohibit. The rcu_dereference_protected() variant takes
a lockdep expression to indicate which locks must be acquired
by the caller. If the indicated protection is not provided,
- a lockdep splat is emitted. See Documentation/RCU/Design/Requirements/Requirements.rst
+ a lockdep splat is emitted. See Design/Requirements/Requirements.rst
and the API's code comments for more details and example usage.
.. [2] If the list_for_each_entry_rcu() instance might be used by
@@ -399,8 +399,7 @@ for specialized uses, but are relatively uncommon.
This section shows a simple use of the core RCU API to protect a
global pointer to a dynamically allocated structure. More-typical
-uses of RCU may be found in :ref:`listRCU.rst <list_rcu_doc>`,
-:ref:`arrayRCU.rst <array_rcu_doc>`, and :ref:`NMI-RCU.rst <NMI_rcu_doc>`.
+uses of RCU may be found in listRCU.rst, arrayRCU.rst, and NMI-RCU.rst.
::
struct foo {
@@ -482,10 +481,9 @@ So, to sum up:
RCU read-side critical sections that might be referencing that
data item.
-See checklist.txt for additional rules to follow when using RCU.
-And again, more-typical uses of RCU may be found in :ref:`listRCU.rst
-<list_rcu_doc>`, :ref:`arrayRCU.rst <array_rcu_doc>`, and :ref:`NMI-RCU.rst
-<NMI_rcu_doc>`.
+See checklist.rst for additional rules to follow when using RCU.
+And again, more-typical uses of RCU may be found in listRCU.rst,
+arrayRCU.rst, and NMI-RCU.rst.
.. _4_whatisRCU:
@@ -579,7 +577,7 @@ to avoid having to write your own callback::
kfree_rcu(old_fp, rcu);
-Again, see checklist.txt for additional rules governing the use of RCU.
+Again, see checklist.rst for additional rules governing the use of RCU.
.. _5_whatisRCU:
@@ -663,7 +661,7 @@ been able to write-acquire the lock otherwise. The smp_mb__after_spinlock()
promotes synchronize_rcu() to a full memory barrier in compliance with
the "Memory-Barrier Guarantees" listed in:
- Documentation/RCU/Design/Requirements/Requirements.rst
+ Design/Requirements/Requirements.rst
It is possible to nest rcu_read_lock(), since reader-writer locks may
be recursively acquired. Note also that rcu_read_lock() is immune
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 3f1cc5e317ed..f8d9af5d51e5 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4893,6 +4893,18 @@
rcupdate.rcu_cpu_stall_timeout= [KNL]
Set timeout for RCU CPU stall warning messages.
+ The value is in seconds and the maximum allowed
+ value is 300 seconds.
+
+ rcupdate.rcu_exp_cpu_stall_timeout= [KNL]
+ Set timeout for expedited RCU CPU stall warning
+ messages. The value is in milliseconds
+ and the maximum allowed value is 21000
+ milliseconds. Please note that this value is
+ adjusted to an arch timer tick resolution.
+ Setting this to zero causes the value from
+ rcupdate.rcu_cpu_stall_timeout to be used (after
+ conversion from seconds to milliseconds).
rcupdate.rcu_expedited= [KNL]
Use expedited grace-period primitives, for
@@ -4955,10 +4967,34 @@
number avoids disturbing real-time workloads,
but lengthens grace periods.
+ rcupdate.rcu_task_stall_info= [KNL]
+ Set initial timeout in jiffies for RCU task stall
+ informational messages, which give some indication
+ of the problem for those not patient enough to
+ wait for ten minutes. Informational messages are
+ only printed prior to the stall-warning message
+ for a given grace period. Disable with a value
+ less than or equal to zero. Defaults to ten
+ seconds. A change in value does not take effect
+ until the beginning of the next grace period.
+
+ rcupdate.rcu_task_stall_info_mult= [KNL]
+ Multiplier for time interval between successive
+ RCU task stall informational messages for a given
+ RCU tasks grace period. This value is clamped
+ to one through ten, inclusive. It defaults to
+ the value three, so that the first informational
+ message is printed 10 seconds into the grace
+ period, the second at 40 seconds, the third at
+ 160 seconds, and then the stall warning at 600
+ seconds would prevent a fourth at 640 seconds.
+
rcupdate.rcu_task_stall_timeout= [KNL]
- Set timeout in jiffies for RCU task stall warning
- messages. Disable with a value less than or equal
- to zero.
+ Set timeout in jiffies for RCU task stall
+ warning messages. Disable with a value less
+ than or equal to zero. Defaults to ten minutes.
+ A change in value does not take effect until
+ the beginning of the next grace period.
rcupdate.rcu_self_test= [KNL]
Run the RCU early boot self tests
@@ -5377,6 +5413,17 @@
smart2= [HW]
Format: <io1>[,<io2>[,...,<io8>]]
+ smp.csd_lock_timeout= [KNL]
+ Specify the period of time in milliseconds
+ that smp_call_function() and friends will wait
+ for a CPU to release the CSD lock. This is
+ useful when diagnosing bugs involving CPUs
+ disabling interrupts for extended periods
+ of time. Defaults to 5,000 milliseconds, and
+ setting a value of zero disables this feature.
+ This feature may be more efficiently disabled
+ using the csdlock_debug- kernel parameter.
+
smsc-ircc2.nopnp [HW] Don't use PNP to discover SMC devices
smsc-ircc2.ircc_cfg= [HW] Device configuration I/O port
smsc-ircc2.ircc_sir= [HW] SIR base I/O port
@@ -5608,6 +5655,30 @@
off: Disable mitigation and remove
performance impact to RDRAND and RDSEED
+ srcutree.big_cpu_lim [KNL]
+ Specifies the number of CPUs constituting a
+ large system, such that srcu_struct structures
+ should immediately allocate an srcu_node array.
+ This kernel-boot parameter defaults to 128,
+ but takes effect only when the low-order four
+ bits of srcutree.convert_to_big is equal to 3
+ (decide at boot).
+
+ srcutree.convert_to_big [KNL]
+ Specifies under what conditions an SRCU tree
+ srcu_struct structure will be converted to big
+ form, that is, with an rcu_node tree:
+
+ 0: Never.
+ 1: At init_srcu_struct() time.
+ 2: When rcutorture decides to.
+ 3: Decide at boot time (default).
+ 0x1X: Above plus if high contention.
+
+ Either way, the srcu_node tree will be sized based
+ on the actual runtime number of CPUs (nr_cpu_ids)
+ instead of the compile-time CONFIG_NR_CPUS.
+
srcutree.counter_wrap_check [KNL]
Specifies how frequently to check for
grace-period sequence counter wrap for the
@@ -5625,6 +5696,14 @@
expediting. Set to zero to disable automatic
expediting.
+ srcutree.small_contention_lim [KNL]
+ Specifies the number of update-side contention
+ events per jiffy will be tolerated before
+ initiating a conversion of an srcu_struct
+ structure to big form. Note that the value of
+ srcutree.convert_to_big must have the 0x10 bit
+ set for contention-based conversions to occur.
+
ssbd= [ARM64,HW]
Speculative Store Bypass Disable control