summaryrefslogtreecommitdiffstats
path: root/Documentation
diff options
context:
space:
mode:
authorLinus Torvalds <torvalds@linux-foundation.org>2012-03-22 19:52:47 -0700
committerLinus Torvalds <torvalds@linux-foundation.org>2012-03-22 19:52:47 -0700
commitaab008db8063364dc3c8ccf4981c21124866b395 (patch)
tree72914203f4decb023efdaabd0301a62d742dfa8c /Documentation
parent4f5b1affdda3e0c48cac674182f52004137b0ffc (diff)
parent16c0cfa425b8e1488f7a1873bd112a7a099325f0 (diff)
downloadlinux-aab008db8063364dc3c8ccf4981c21124866b395.tar.bz2
Merge tag 'stable/for-linus-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/mm
Pull cleancache changes from Konrad Rzeszutek Wilk: "This has some patches for the cleancache API that should have been submitted a _long_ time ago. They are basically cleanups: - rename of flush to invalidate - moving reporting of statistics into debugfs - use __read_mostly as necessary. Oh, and also the MAINTAINERS file change. The files (except the MAINTAINERS file) have been in #linux-next for months now. The late addition of MAINTAINERS file is a brain-fart on my side - didn't realize I needed that just until I was typing this up - and I based that patch on v3.3 - so the tree is on top of v3.3." * tag 'stable/for-linus-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/mm: MAINTAINERS: Adding cleancache API to the list. mm: cleancache: Use __read_mostly as appropiate. mm: cleancache: report statistics via debugfs instead of sysfs. mm: zcache/tmem/cleancache: s/flush/invalidate/ mm: cleancache: s/flush/invalidate/
Diffstat (limited to 'Documentation')
-rw-r--r--Documentation/ABI/testing/sysfs-kernel-mm-cleancache11
-rw-r--r--Documentation/vm/cleancache.txt41
2 files changed, 21 insertions, 31 deletions
diff --git a/Documentation/ABI/testing/sysfs-kernel-mm-cleancache b/Documentation/ABI/testing/sysfs-kernel-mm-cleancache
deleted file mode 100644
index 662ae646ea12..000000000000
--- a/Documentation/ABI/testing/sysfs-kernel-mm-cleancache
+++ /dev/null
@@ -1,11 +0,0 @@
-What: /sys/kernel/mm/cleancache/
-Date: April 2011
-Contact: Dan Magenheimer <dan.magenheimer@oracle.com>
-Description:
- /sys/kernel/mm/cleancache/ contains a number of files which
- record a count of various cleancache operations
- (sum across all filesystems):
- succ_gets
- failed_gets
- puts
- flushes
diff --git a/Documentation/vm/cleancache.txt b/Documentation/vm/cleancache.txt
index d5c615af10ba..142fbb0f325a 100644
--- a/Documentation/vm/cleancache.txt
+++ b/Documentation/vm/cleancache.txt
@@ -46,10 +46,11 @@ a negative return value indicates failure. A "put_page" will copy a
the pool id, a file key, and a page index into the file. (The combination
of a pool id, a file key, and an index is sometimes called a "handle".)
A "get_page" will copy the page, if found, from cleancache into kernel memory.
-A "flush_page" will ensure the page no longer is present in cleancache;
-a "flush_inode" will flush all pages associated with the specified file;
-and, when a filesystem is unmounted, a "flush_fs" will flush all pages in
-all files specified by the given pool id and also surrender the pool id.
+An "invalidate_page" will ensure the page no longer is present in cleancache;
+an "invalidate_inode" will invalidate all pages associated with the specified
+file; and, when a filesystem is unmounted, an "invalidate_fs" will invalidate
+all pages in all files specified by the given pool id and also surrender
+the pool id.
An "init_shared_fs", like init_fs, obtains a pool id but tells cleancache
to treat the pool as shared using a 128-bit UUID as a key. On systems
@@ -62,12 +63,12 @@ of the kernel (e.g. by "tools" that control cleancache). Or a
cleancache implementation can simply disable shared_init by always
returning a negative value.
-If a get_page is successful on a non-shared pool, the page is flushed (thus
-making cleancache an "exclusive" cache). On a shared pool, the page
-is NOT flushed on a successful get_page so that it remains accessible to
+If a get_page is successful on a non-shared pool, the page is invalidated
+(thus making cleancache an "exclusive" cache). On a shared pool, the page
+is NOT invalidated on a successful get_page so that it remains accessible to
other sharers. The kernel is responsible for ensuring coherency between
cleancache (shared or not), the page cache, and the filesystem, using
-cleancache flush operations as required.
+cleancache invalidate operations as required.
Note that cleancache must enforce put-put-get coherency and get-get
coherency. For the former, if two puts are made to the same handle but
@@ -77,20 +78,20 @@ if a get for a given handle fails, subsequent gets for that handle will
never succeed unless preceded by a successful put with that handle.
Last, cleancache provides no SMP serialization guarantees; if two
-different Linux threads are simultaneously putting and flushing a page
+different Linux threads are simultaneously putting and invalidating a page
with the same handle, the results are indeterminate. Callers must
lock the page to ensure serial behavior.
CLEANCACHE PERFORMANCE METRICS
-Cleancache monitoring is done by sysfs files in the
-/sys/kernel/mm/cleancache directory. The effectiveness of cleancache
+If properly configured, monitoring of cleancache is done via debugfs in
+the /sys/kernel/debug/mm/cleancache directory. The effectiveness of cleancache
can be measured (across all filesystems) with:
succ_gets - number of gets that were successful
failed_gets - number of gets that failed
puts - number of puts attempted (all "succeed")
-flushes - number of flushes attempted
+invalidates - number of invalidates attempted
A backend implementation may provide additional metrics.
@@ -143,7 +144,7 @@ systems.
The core hooks for cleancache in VFS are in most cases a single line
and the minimum set are placed precisely where needed to maintain
-coherency (via cleancache_flush operations) between cleancache,
+coherency (via cleancache_invalidate operations) between cleancache,
the page cache, and disk. All hooks compile into nothingness if
cleancache is config'ed off and turn into a function-pointer-
compare-to-NULL if config'ed on but no backend claims the ops
@@ -184,15 +185,15 @@ or for real kernel-addressable RAM, it makes perfect sense for
transcendent memory.
4) Why is non-shared cleancache "exclusive"? And where is the
- page "flushed" after a "get"? (Minchan Kim)
+ page "invalidated" after a "get"? (Minchan Kim)
The main reason is to free up space in transcendent memory and
-to avoid unnecessary cleancache_flush calls. If you want inclusive,
+to avoid unnecessary cleancache_invalidate calls. If you want inclusive,
the page can be "put" immediately following the "get". If
put-after-get for inclusive becomes common, the interface could
-be easily extended to add a "get_no_flush" call.
+be easily extended to add a "get_no_invalidate" call.
-The flush is done by the cleancache backend implementation.
+The invalidate is done by the cleancache backend implementation.
5) What's the performance impact?
@@ -222,7 +223,7 @@ Some points for a filesystem to consider:
as tmpfs should not enable cleancache)
- To ensure coherency/correctness, the FS must ensure that all
file removal or truncation operations either go through VFS or
- add hooks to do the equivalent cleancache "flush" operations
+ add hooks to do the equivalent cleancache "invalidate" operations
- To ensure coherency/correctness, either inode numbers must
be unique across the lifetime of the on-disk file OR the
FS must provide an "encode_fh" function.
@@ -243,11 +244,11 @@ If cleancache would use the inode virtual address instead of
inode/filehandle, the pool id could be eliminated. But, this
won't work because cleancache retains pagecache data pages
persistently even when the inode has been pruned from the
-inode unused list, and only flushes the data page if the file
+inode unused list, and only invalidates the data page if the file
gets removed/truncated. So if cleancache used the inode kva,
there would be potential coherency issues if/when the inode
kva is reused for a different file. Alternately, if cleancache
-flushed the pages when the inode kva was freed, much of the value
+invalidated the pages when the inode kva was freed, much of the value
of cleancache would be lost because the cache of pages in cleanache
is potentially much larger than the kernel pagecache and is most
useful if the pages survive inode cache removal.