From 621934ee7ed5b073c7fd638b347e632c53572761 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Wed, 4 Oct 2006 02:17:02 -0700 Subject: [PATCH] srcu-3: RCU variant permitting read-side blocking Updated patch adding a variant of RCU that permits sleeping in read-side critical sections. SRCU is as follows: o Each use of SRCU creates its own srcu_struct, and each srcu_struct has its own set of grace periods. This is critical, as it prevents one subsystem with a blocking reader from holding up SRCU grace periods for other subsystems. o The SRCU primitives (srcu_read_lock(), srcu_read_unlock(), and synchronize_srcu()) all take a pointer to a srcu_struct. o The SRCU primitives must be called from process context. o srcu_read_lock() returns an int that must be passed to the matching srcu_read_unlock(). Realtime RCU avoids the need for this by storing the state in the task struct, but SRCU needs to allow a given code path to pass through multiple SRCU domains -- storing state in the task struct would therefore require either arbitrary space in the task struct or arbitrary limits on SRCU nesting. So I kicked the state-storage problem up to the caller. Of course, it is not permitted to call synchronize_srcu() while in an SRCU read-side critical section. o There is no call_srcu(). It would not be hard to implement one, but it seems like too easy a way to OOM the system. (Hey, we have enough trouble with call_rcu(), which does -not- permit readers to sleep!!!) So, if you want it, please tell me why... [josht@us.ibm.com: sparse notation] Signed-off-by: Paul E. McKenney Signed-off-by: Josh Triplett Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- Documentation/RCU/checklist.txt | 38 ++++++++++++++++++++++++++++++++++++++ Documentation/RCU/rcu.txt | 3 ++- Documentation/RCU/whatisRCU.txt | 3 +++ 3 files changed, 43 insertions(+), 1 deletion(-) (limited to 'Documentation/RCU') diff --git a/Documentation/RCU/checklist.txt b/Documentation/RCU/checklist.txt index 1d50cf0c905e..f4dffadbcb00 100644 --- a/Documentation/RCU/checklist.txt +++ b/Documentation/RCU/checklist.txt @@ -221,3 +221,41 @@ over a rather long period of time, but improvements are always welcome! disable irq on a given acquisition of that lock will result in deadlock as soon as the RCU callback happens to interrupt that acquisition's critical section. + +13. SRCU (srcu_read_lock(), srcu_read_unlock(), and synchronize_srcu()) + may only be invoked from process context. Unlike other forms of + RCU, it -is- permissible to block in an SRCU read-side critical + section (demarked by srcu_read_lock() and srcu_read_unlock()), + hence the "SRCU": "sleepable RCU". Please note that if you + don't need to sleep in read-side critical sections, you should + be using RCU rather than SRCU, because RCU is almost always + faster and easier to use than is SRCU. + + Also unlike other forms of RCU, explicit initialization + and cleanup is required via init_srcu_struct() and + cleanup_srcu_struct(). These are passed a "struct srcu_struct" + that defines the scope of a given SRCU domain. Once initialized, + the srcu_struct is passed to srcu_read_lock(), srcu_read_unlock() + and synchronize_srcu(). A given synchronize_srcu() waits only + for SRCU read-side critical sections governed by srcu_read_lock() + and srcu_read_unlock() calls that have been passd the same + srcu_struct. This property is what makes sleeping read-side + critical sections tolerable -- a given subsystem delays only + its own updates, not those of other subsystems using SRCU. + Therefore, SRCU is less prone to OOM the system than RCU would + be if RCU's read-side critical sections were permitted to + sleep. + + The ability to sleep in read-side critical sections does not + come for free. First, corresponding srcu_read_lock() and + srcu_read_unlock() calls must be passed the same srcu_struct. + Second, grace-period-detection overhead is amortized only + over those updates sharing a given srcu_struct, rather than + being globally amortized as they are for other forms of RCU. + Therefore, SRCU should be used in preference to rw_semaphore + only in extremely read-intensive situations, or in situations + requiring SRCU's read-side deadlock immunity or low read-side + realtime latency. + + Note that, rcu_assign_pointer() and rcu_dereference() relate to + SRCU just as they do to other forms of RCU. diff --git a/Documentation/RCU/rcu.txt b/Documentation/RCU/rcu.txt index 02e27bf1d365..f84407cba816 100644 --- a/Documentation/RCU/rcu.txt +++ b/Documentation/RCU/rcu.txt @@ -45,7 +45,8 @@ o How can I see where RCU is currently used in the Linux kernel? Search for "rcu_read_lock", "rcu_read_unlock", "call_rcu", "rcu_read_lock_bh", "rcu_read_unlock_bh", "call_rcu_bh", - "synchronize_rcu", and "synchronize_net". + "srcu_read_lock", "srcu_read_unlock", "synchronize_rcu", + "synchronize_net", and "synchronize_srcu". o What guidelines should I follow when writing code that uses RCU? diff --git a/Documentation/RCU/whatisRCU.txt b/Documentation/RCU/whatisRCU.txt index 820fee236967..e0d6d99b8f9b 100644 --- a/Documentation/RCU/whatisRCU.txt +++ b/Documentation/RCU/whatisRCU.txt @@ -778,6 +778,8 @@ Markers for RCU read-side critical sections: rcu_read_unlock rcu_read_lock_bh rcu_read_unlock_bh + srcu_read_lock + srcu_read_unlock RCU pointer/list traversal: @@ -804,6 +806,7 @@ RCU grace period: synchronize_net synchronize_sched synchronize_rcu + synchronize_srcu call_rcu call_rcu_bh -- cgit v1.2.3