summaryrefslogtreecommitdiffstats
path: root/fs/ceph/addr.c
AgeCommit message (Collapse)AuthorFilesLines
2017-03-02sched/headers: Prepare for the reduction of <linux/sched.h>'s signal API ↵Ingo Molnar1-0/+1
dependency Instead of including the full <linux/signal.h>, we are going to include the types-only <linux/signal_types.h> header in <linux/sched.h>, to further decouple the scheduler header from the signal headers. This means that various files which relied on the full <linux/signal.h> need to be updated to gain an explicit dependency on it. Update the code that relies on sched.h's inclusion of the <linux/signal.h> header. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-02-28Merge tag 'ceph-for-4.11-rc1' of git://github.com/ceph/ceph-clientLinus Torvalds1-11/+8
Pull ceph updates from Ilya Dryomov: "This time around we have: - support for rbd data-pool feature, which enables rbd images on erasure-coded pools (myself). CEPH_PG_MAX_SIZE has been bumped to allow erasure-coded profiles with k+m up to 32. - a patch for ceph_d_revalidate() performance regression introduced in 4.9, along with some cleanups in the area (Jeff Layton) - a set of fixes for unsafe ->d_parent accesses in CephFS (Jeff Layton) - buffered reads are now processed in rsize windows instead of rasize windows (Andreas Gerstmayr). The new default for rsize mount option is 64M. - ack vs commit distinction is gone, greatly simplifying ->fsync() and MOSDOpReply handling code (myself) ... also a few filesystem bug fixes from Zheng, a CRUSH sync up (CRUSH computations are still serialized though) and several minor fixes and cleanups all over" * tag 'ceph-for-4.11-rc1' of git://github.com/ceph/ceph-client: (52 commits) libceph, rbd, ceph: WRITE | ONDISK -> WRITE libceph: get rid of ack vs commit ceph: remove special ack vs commit behavior ceph: tidy some white space in get_nonsnap_parent() crush: fix dprintk compilation crush: do is_out test only if we do not collide ceph: remove req from unsafe list when unregistering it rbd: constify device_type structure rbd: kill obj_request->object_name and rbd_segment_name_cache rbd: store and use obj_request->object_no rbd: RBD_V{1,2}_DATA_FORMAT macros rbd: factor out __rbd_osd_req_create() rbd: set offset and length outside of rbd_obj_request_create() rbd: support for data-pool feature rbd: introduce rbd_init_layout() rbd: use rbd_obj_bytes() more rbd: remove now unused rbd_obj_request_wait() and helpers rbd: switch rbd_obj_method_sync() to ceph_osdc_call() libceph: pass reply buffer length through ceph_osdc_call() rbd: do away with obj_request in rbd_obj_read_sync() ...
2017-02-27fs: add i_blocksize()Fabian Frederick1-1/+1
Replace all 1 << inode->i_blkbits and (1 << inode->i_blkbits) in fs branch. This patch also fixes multiple checkpatch warnings: WARNING: Prefer 'unsigned int' to bare use of 'unsigned' Thanks to Andrew Morton for suggesting more appropriate function instead of macro. [geliangtang@gmail.com: truncate: use i_blocksize()] Link: http://lkml.kernel.org/r/9c8b2cd83c8f5653805d43debde9fa8817e02fc4.1484895804.git.geliangtang@gmail.com Link: http://lkml.kernel.org/r/1481319905-10126-1-git-send-email-fabf@skynet.be Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-24mm, fs: reduce fault, page_mkwrite, and pfn_mkwrite to take only vmfDave Jiang1-3/+5
->fault(), ->page_mkwrite(), and ->pfn_mkwrite() calls do not need to take a vma and vmf parameter when the vma already resides in vmf. Remove the vma parameter to simplify things. [arnd@arndb.de: fix ARM build] Link: http://lkml.kernel.org/r/20170125223558.1451224-1-arnd@arndb.de Link: http://lkml.kernel.org/r/148521301778.19116.10840599906674778980.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@suse.com> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-24libceph, rbd, ceph: WRITE | ONDISK -> WRITEIlya Dryomov1-9/+5
CEPH_OSD_FLAG_ONDISK is set in account_request(). Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Sage Weil <sage@redhat.com>
2017-02-24ceph: remove special ack vs commit behaviorIlya Dryomov1-1/+1
- ask for a commit reply instead of an ack reply in __ceph_pool_perm_get() - don't ask for both ack and commit replies in ceph_sync_write() - since just only one reply is requested now, i_unsafe_writes list will always be empty -- kill ceph_sync_write_wait() and go back to a standard ->evict_inode() Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Sage Weil <sage@redhat.com>
2017-02-20ceph: update readpages osd request according to size of pagesYan, Zheng1-0/+1
add_to_page_cache_lru() can fails, so the actual pages to read can be smaller than the initial size of osd request. We need to update osd request size in that case. Signed-off-by: Yan, Zheng <zyan@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com>
2017-02-20ceph: cleanup ACCESS_ONCE -> READ_ONCESeraphime Kirkovski1-2/+2
This removes the uses of ACCESS_ONCE in favor of READ_ONCE Signed-off-by: Seraphime Kirkovski <kirkseraph@gmail.com> Signed-off-by: Yan, Zheng <zyan@redhat.com>
2017-01-12ceph: fix get_oldest_context()Geng, Jichao1-2/+2
For no snapshot case, we should use ci->truncate_{seq,size}. Fixes: 5f743e456606 ("ceph: record truncate size/seq for snap data writeback") Signed-off-by: Geng, Jichao <geng.jichao@h3c.com> Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-12-17Merge branch 'for-linus' of ↵Linus Torvalds1-6/+8
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull more vfs updates from Al Viro: "In this pile: - autofs-namespace series - dedupe stuff - more struct path constification" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (40 commits) ocfs2: implement the VFS clone_range, copy_range, and dedupe_range features ocfs2: charge quota for reflinked blocks ocfs2: fix bad pointer cast ocfs2: always unlock when completing dio writes ocfs2: don't eat io errors during _dio_end_io_write ocfs2: budget for extent tree splits when adding refcount flag ocfs2: prohibit refcounted swapfiles ocfs2: add newlines to some error messages ocfs2: convert inode refcount test to a helper simple_write_end(): don't zero in short copy into uptodate exofs: don't mess with simple_write_{begin,end} 9p: saner ->write_end() on failing copy into non-uptodate page fix gfs2_stuffed_write_end() on short copies fix ceph_write_end() nfs_write_end(): fix handling of short copies vfs: refactor clone/dedupe_file_range common functions fs: try to clone files first in vfs_copy_file_range vfs: misc struct path constification namespace.c: constify struct path passed to a bunch of primitives quota: constify struct path in quota_on ...
2016-12-14ceph: avoid creating orphan object when checking pool permissionYan, Zheng1-0/+9
Pool permission check needs to write to the first object. But for snapshot, head of the first object may have already been deleted. Skip the check for snapshot inode to avoid creating orphan object. Link: http://tracker.ceph.com/issues/18211 Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-12-12ceph: record truncate size/seq for snap data writebackYan, Zheng1-13/+18
Dirty snapshot data needs to be flushed unconditionally. If they were created before truncation, writeback should use old truncate size/seq. Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-12-12ceph: try getting buffer capability for readahead/fadviseYan, Zheng1-10/+48
For readahead/fadvise cases, caller of ceph_readpages does not hold buffer capability. Pages can be added to page cache while there is no buffer capability. This can cause data integrity issue. Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-12-10fix ceph_write_end()Al Viro1-6/+8
don't zero on short copies; if the page was uptodate it's just plain wrong, and if it wasn't we'll be better off just returning 0 and buggering off. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-10-03ceph: remove warning when ceph_releasepage() is called on dirty pageNeilBrown1-3/+2
If O_DIRECT writes are racing with buffered writes, then the call to invalidate_inode_pages2_range() can call ceph_releasepage() on dirty pages. Most filesystems hold inode_lock() across O_DIRECT writes so they do not suffer this race, but cephfs deliberately drops the lock, and opens a window for the race. This race can be triggered with the generic/036 test from the xfstests test suite. It doesn't happen every time, but it does happen often. As the possibilty is expected, remove the warning, and instead include the PageDirty() status in the debug message. Signed-off-by: NeilBrown <neilb@suse.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Yan, Zheng <zyan@redhat.com>
2016-10-03ceph: fix error handling of start_read()Yan, Zheng1-10/+9
If start_page() fails to add a page to page cache or fails to send OSD request. It should cal put_page() (instead of free_page()) for relevant pages. Besides, start_page() need to cancel fscache readpage if it fails to send OSD request. Signed-off-by: Yan, Zheng <zyan@redhat.com> Reported-by: Zhi Zhang <zhang.david2011@gmail.com>
2016-07-28ceph: rados pool namespace supportYan, Zheng1-15/+52
This patch adds codes that decode pool namespace information in cap message and request reply. Pool namespace is saved in i_layout, it will be passed to libceph when doing read/write. Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-07-28libceph: define new ceph_file_layout structureYan, Zheng1-9/+9
Define new ceph_file_layout structure and rename old ceph_file_layout to ceph_file_layout_legacy. This is preparation for adding namespace to ceph_file_layout structure. Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-06-01ceph: disable fscache when inode is opened for writeYan, Zheng1-2/+0
All other filesystems do not add dirty pages to fscache. They all disable fscache when inode is opened for write. Only ceph adds dirty pages to fscache, but the code is buggy. Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-06-01ceph: call __fscache_uncache_page() if readpages failsYan, Zheng1-1/+3
If readpages fails, fscache needs to cleanup its internal state. Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-05-26Merge branch 'for-linus' of ↵Linus Torvalds1-92/+122
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull Ceph updates from Sage Weil: "This changeset has a few main parts: - Ilya has finished a huge refactoring effort to sync up the client-side logic in libceph with the user-space client code, which has evolved significantly over the last couple years, with lots of additional behaviors (e.g., how requests are handled when cluster is full and transitions from full to non-full). This structure of the code is more closely aligned with userspace now such that it will be much easier to maintain going forward when behavior changes take place. There are some locking improvements bundled in as well. - Zheng adds multi-filesystem support (multiple namespaces within the same Ceph cluster) - Zheng has changed the readdir offsets and directory enumeration so that dentry offsets are hash-based and therefore stable across directory fragmentation events on the MDS. - Zheng has a smorgasbord of bug fixes across fs/ceph" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (71 commits) ceph: fix wake_up_session_cb() ceph: don't use truncate_pagecache() to invalidate read cache ceph: SetPageError() for writeback pages if writepages fails ceph: handle interrupted ceph_writepage() ceph: make ceph_update_writeable_page() uninterruptible libceph: make ceph_osdc_wait_request() uninterruptible ceph: handle -EAGAIN returned by ceph_update_writeable_page() ceph: make fault/page_mkwrite return VM_FAULT_OOM for -ENOMEM ceph: block non-fatal signals for fault/page_mkwrite ceph: make logical calculation functions return bool ceph: tolerate bad i_size for symlink inode ceph: improve fragtree change detection ceph: keep leaf frag when updating fragtree ceph: fix dir_auth check in ceph_fill_dirfrag() ceph: don't assume frag tree splits in mds reply are sorted ceph: fix inode reference leak ceph: using hash value to compose dentry offset ceph: don't forbid marking directory complete after forward seek ceph: record 'offset' for each entry of readdir result ceph: define 'end/complete' in readdir reply as bit flags ...
2016-05-26ceph: SetPageError() for writeback pages if writepages failsYan, Zheng1-1/+3
Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-05-26ceph: handle interrupted ceph_writepage()Yan, Zheng1-4/+18
writepage() can be interrupted when it's called by direct memory reclaimer (the direct memory relaimer is killed). To avoid lossing data, we redirty the page. Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-05-26ceph: make ceph_update_writeable_page() uninterruptibleYan, Zheng1-1/+1
ceph_update_writeable_page() is used by ceph_write_begin(). It beaks atomicity of write operation if it's interruptible. Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-05-26ceph: handle -EAGAIN returned by ceph_update_writeable_page()Yan, Zheng1-13/+15
when ceph_update_writeable_page() return -EAGAIN, caller should lock the page and call ceph_update_writeable_page() again. Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-05-26ceph: make fault/page_mkwrite return VM_FAULT_OOM for -ENOMEMYan, Zheng1-20/+17
Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-05-26ceph: block non-fatal signals for fault/page_mkwriteYan, Zheng1-27/+39
Fault and page_mkwrite are supposed to be uninterruptable. But they call ceph functions that are interruptible. So they should block signals before calling functions that are interruptible Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-05-26ceph: don't call truncate_pagecache in ceph_writepages_startYan, Zheng1-2/+12
truncate_pagecache() may decrease inode's reference. This can cause deadlock if inode's last reference is dropped and iput_final() wants to evict the inode. (evict() calls inode_wait_for_writeback(), which waits for ceph_writepages_start() to return). The fix is use work thead to truncate dirty pages. Also add 'forced umount' check to ceph_update_writeable_page(), which prevents new pages getting dirty. Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-05-26libceph: redo callbacks and factor out MOSDOpReply decodingIlya Dryomov1-2/+1
If you specify ACK | ONDISK and set ->r_unsafe_callback, both ->r_callback and ->r_unsafe_callback(true) are called on ack. This is very confusing. Redo this so that only one of them is called: ->r_unsafe_callback(true), on ack ->r_unsafe_callback(false), on commit or ->r_callback, on ack|commit Decode everything in decode_MOSDOpReply() to reduce clutter. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-05-26libceph: drop msg argument from ceph_osdc_callback_tIlya Dryomov1-5/+4
finish_read(), its only user, uses it to get to hdr.data_len, which is what ->r_result is set to on success. This gains us the ability to safely call callbacks from contexts other than reply, e.g. map check. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-05-26libceph: switch to calc_target(), part 2Ilya Dryomov1-12/+4
The crux of this is getting rid of ceph_osdc_build_request(), so that MOSDOp can be encoded not before but after calc_target() calculates the actual target. Encoding now happens within ceph_osdc_start_request(). Also nuked is the accompanying bunch of pointers into the encoded buffer that was used to update fields on each send - instead, the entire front is re-encoded. If we want to support target->name_len != base->name_len in the future, there is no other way, because oid is surrounded by other fields in the encoded buffer. Encoding OSD ops and adding data items to the request message were mixed together in osd_req_encode_op(). While we want to re-encode OSD ops, we don't want to add duplicate data items to the message when resending, so all call to ceph_osdc_msg_data_add() are factored out into a new setup_request_data(). Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-05-26libceph: introduce ceph_osd_request_target, calc_target()Ilya Dryomov1-1/+1
Introduce ceph_osd_request_target, containing all mapping-related fields of ceph_osd_request and calc_target() for calculating mappings and populating it. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-05-26libceph: variable-sized ceph_object_idIlya Dryomov1-4/+2
Currently ceph_object_id can hold object names of up to 100 (CEPH_MAX_OID_NAME_LEN) characters. This is enough for all use cases, expect one - long rbd image names: - a format 1 header is named "<imgname>.rbd" - an object that points to a format 2 header is named "rbd_id.<imgname>" We operate on these potentially long-named objects during rbd map, and, for format 1 images, during header refresh. (A format 2 header name is a small system-generated string.) Lift this 100 character limit by making ceph_object_id be able to point to an externally-allocated string. Apart from being able to work with almost arbitrarily-long named objects, this allows us to reduce the size of ceph_object_id from >100 bytes to 64 bytes. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-05-26libceph: move message allocation out of ceph_osdc_alloc_request()Ilya Dryomov1-0/+8
The size of ->r_request and ->r_reply messages depends on the size of the object name (ceph_object_id), while the size of ceph_osd_request is fixed. Move message allocation into a separate function that would have to be called after ceph_object_id and ceph_object_locator (which is also going to become variable in size with RADOS namespaces) have been filled in: req = ceph_osdc_alloc_request(...); <fill in req->r_base_oid> <fill in req->r_base_oloc> ceph_osdc_alloc_messages(req); Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-05-26libceph: make ceph_osdc_put_request() accept NULLIlya Dryomov1-6/+3
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-05-01direct-io: eliminate the offset argument to ->direct_IOChristoph Hellwig1-2/+1
Including blkdev_direct_IO and dax_do_io. It has to be ki_pos to actually work, so eliminate the superflous argument. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-04-04mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macrosKirill A. Shutemov1-57/+57
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time ago with promise that one day it will be possible to implement page cache with bigger chunks than PAGE_SIZE. This promise never materialized. And unlikely will. We have many places where PAGE_CACHE_SIZE assumed to be equal to PAGE_SIZE. And it's constant source of confusion on whether PAGE_CACHE_* or PAGE_* constant should be used in a particular case, especially on the border between fs and mm. Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much breakage to be doable. Let's stop pretending that pages in page cache are special. They are not. The changes are pretty straight-forward: - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN}; - page_cache_get() -> get_page(); - page_cache_release() -> put_page(); This patch contains automated changes generated with coccinelle using script below. For some reason, coccinelle doesn't patch header files. I've called spatch for them manually. The only adjustment after coccinelle is revert of changes to PAGE_CAHCE_ALIGN definition: we are going to drop it later. There are few places in the code where coccinelle didn't reach. I'll fix them manually in a separate patch. Comments and documentation also will be addressed with the separate patch. virtual patch @@ expression E; @@ - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ expression E; @@ - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ @@ - PAGE_CACHE_SHIFT + PAGE_SHIFT @@ @@ - PAGE_CACHE_SIZE + PAGE_SIZE @@ @@ - PAGE_CACHE_MASK + PAGE_MASK @@ expression E; @@ - PAGE_CACHE_ALIGN(E) + PAGE_ALIGN(E) @@ expression E; @@ - page_cache_get(E) + get_page(E) @@ expression E; @@ - page_cache_release(E) + put_page(E) Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-25ceph: remove unnecessary NULL checkYan, Zheng1-2/+2
If page->mapping is NULL, releasepage() callback does not get called. Remove the unnecessary NULL check to make static code analysis tool happy Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-03-25ceph: kill ceph_empty_snapcIlya Dryomov1-8/+5
ceph_empty_snapc->num_snaps == 0 at all times. Passing such a snapc to ceph_osdc_alloc_request() (possibly through ceph_osdc_new_request()) is equivalent to passing NULL, as ceph_osdc_alloc_request() uses it only for sizing the request message. Further, in all four cases the subsequent ceph_osdc_build_request() is passed NULL for snapc, meaning that 0 is encoded for seq and num_snaps and making ceph_empty_snapc entirely useless. The two cases where it actually mattered were removed in commits 860560904962 ("ceph: avoid sending unnessesary FLUSHSNAP message") and 23078637e054 ("ceph: fix queuing inode to mdsdir's snaprealm"). Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Yan, Zheng <zyan@redhat.com>
2016-03-25ceph: fix a wrong comparisonAnton Protopopov1-1/+1
A negative value rc compared to the positive value ENOENT in the finish_read() function. Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com> Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-03-25ceph: scattered page writebackYan, Zheng1-109/+196
This patch makes ceph_writepages_start() try using single OSD request to write all dirty pages within a strip unit. When a nonconsecutive dirty page is found, ceph_writepages_start() tries starting a new write operation to existing OSD request. If it succeeds, it uses the new operation to writeback the dirty page. Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-03-04ceph: initial CEPH_FEATURE_FS_FILE_LAYOUT_V2 supportYan, Zheng1-0/+4
Add support for the format change of MClientReply/MclientCaps. Also add code that denies access to inodes with pool_ns layouts. Signed-off-by: Yan, Zheng <zyan@redhat.com> Reviewed-by: Sage Weil <sage@redhat.com>
2016-01-21ceph: use i_size_{read,write} to get/set i_sizeYan, Zheng1-3/+2
Cap message from MDS can update i_size. In that case, we don't hold i_mutex. So it's unsafe to directly access inode->i_size while holding i_mutex. Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-01-21ceph: Avoid to propagate the invalid page pointMinfei Huang1-1/+0
The variant pagep will still get the invalid page point, although ceph fails in function ceph_update_writeable_page. To fix this issue, Assigne the page to pagep until there is no failure in function ceph_update_writeable_page. Signed-off-by: Minfei Huang <mnfhuang@gmail.com> Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-01-21ceph: fix double page_unlock() in page_mkwrite()Yan, Zheng1-4/+4
ceph_update_writeable_page() unlocks the page on errors, so page_mkwrite() should not unlock the page again. Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-11-06mm, fs: introduce mapping_gfp_constraint()Michal Hocko1-3/+4
There are many places which use mapping_gfp_mask to restrict a more generic gfp mask which would be used for allocations which are not directly related to the page cache but they are performed in the same context. Let's introduce a helper function which makes the restriction explicit and easier to track. This patch doesn't introduce any functional changes. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Michal Hocko <mhocko@suse.com> Suggested-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-11Merge branch 'for-linus' of ↵Linus Torvalds1-2/+4
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull Ceph update from Sage Weil: "There are a few fixes for snapshot behavior with CephFS and support for the new keepalive protocol from Zheng, a libceph fix that affects both RBD and CephFS, a few bug fixes and cleanups for RBD from Ilya, and several small fixes and cleanups from Jianpeng and others" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: ceph: improve readahead for file holes ceph: get inode size for each append write libceph: check data_len in ->alloc_msg() libceph: use keepalive2 to verify the mon session is alive rbd: plug rbd_dev->header.object_prefix memory leak rbd: fix double free on rbd_dev->header_name libceph: set 'exists' flag for newly up osd ceph: cleanup use of ceph_msg_get ceph: no need to get parent inode in ceph_open ceph: remove the useless judgement ceph: remove redundant test of head->safe and silence static analysis warnings ceph: fix queuing inode to mdsdir's snaprealm libceph: rename con_work() to ceph_con_workfn() libceph: Avoid holding the zero page on ceph_msgr_slab_init errors libceph: remove the unused macro AES_KEY_SIZE ceph: invalidate dirty pages after forced umount ceph: EIO all operations after forced umount
2015-09-10mm: mark most vm_operations_struct constKirill A. Shutemov1-1/+1
With two exceptions (drm/qxl and drm/radeon) all vm_operations_struct structs should be constant. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Minchan Kim <minchan@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-09ceph: improve readahead for file holesYan, Zheng1-1/+1
When readahead encounters file holes, osd reply returns error -ENOENT, finish_read() skips adding pages to the the page cache. So readahead does not work for file holes. The fix is adding zero pages to the page cache when -ENOENT is returned. Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-09-08ceph: invalidate dirty pages after forced umountYan, Zheng1-0/+2
After forced umount, ceph_writepages_start() skips flushing dirty pages. To make sure inode's reference count get dropped to zero, we need to invalidate dirty pages. Signed-off-by: Yan, Zheng <zyan@redhat.com>