summaryrefslogtreecommitdiffstats
path: root/fs/afs
AgeCommit message (Collapse)AuthorFilesLines
2021-11-10afs: Use folios in directory handlingDavid Howells2-209/+174
Convert the AFS directory handling code to use folios. With these changes, afs passes -g quick xfstests. Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: kafs-testing@auristor.com cc: Matthew Wilcox (Oracle) <willy@infradead.org> cc: Jeff Layton <jlayton@kernel.org> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/162877312172.3085614.992850861791211206.stgit@warthog.procyon.org.uk/ Link: https://lore.kernel.org/r/162981154845.1901565.2078707403143240098.stgit@warthog.procyon.org.uk/ Link: https://lore.kernel.org/r/163005746215.2472992.8321380998443828308.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163584190457.4023316.10544419117563104940.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/CAH2r5mtECQA6K_OGgU=_G8qLY3G-6-jo1odVyF9EK+O2-EWLFg@mail.gmail.com/ # v3 Link: https://lore.kernel.org/r/163649330345.309189.11182522282723655658.stgit@warthog.procyon.org.uk/ # v4 Link: https://lore.kernel.org/r/163657854055.834781.5800946340537517009.stgit@warthog.procyon.org.uk/ # v5
2021-11-10netfs, 9p, afs, ceph: Use foliosDavid Howells3-234/+229
Convert the netfs helper library to use folios throughout, convert the 9p and afs filesystems to use folios in their file I/O paths and convert the ceph filesystem to use just enough folios to compile. With these changes, afs passes -g quick xfstests. Changes ======= ver #5: - Got rid of folio_end{io,_read,_write}() and inlined the stuff it does instead (Willy decided he didn't want this after all). ver #4: - Fixed a bug in afs_redirty_page() whereby it didn't set the next page index in the loop and returned too early. - Simplified a check in v9fs_vfs_write_folio_locked()[1]. - Undid a change to afs_symlink_readpage()[1]. - Used offset_in_folio() in afs_write_end()[1]. - Changed from using page_endio() to folio_end{io,_read,_write}()[1]. ver #2: - Add 9p foliation. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Tested-by: Jeff Layton <jlayton@kernel.org> Tested-by: Dominique Martinet <asmadeus@codewreck.org> Tested-by: kafs-testing@auristor.com cc: Matthew Wilcox (Oracle) <willy@infradead.org> cc: Marc Dionne <marc.dionne@auristor.com> cc: Ilya Dryomov <idryomov@gmail.com> cc: Dominique Martinet <asmadeus@codewreck.org> cc: v9fs-developer@lists.sourceforge.net cc: linux-afs@lists.infradead.org cc: ceph-devel@vger.kernel.org cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/YYKa3bfQZxK5/wDN@casper.infradead.org/ [1] Link: https://lore.kernel.org/r/2408234.1628687271@warthog.procyon.org.uk/ # rfc Link: https://lore.kernel.org/r/162877311459.3085614.10601478228012245108.stgit@warthog.procyon.org.uk/ Link: https://lore.kernel.org/r/162981153551.1901565.3124454657133703341.stgit@warthog.procyon.org.uk/ Link: https://lore.kernel.org/r/163005745264.2472992.9852048135392188995.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163584187452.4023316.500389675405550116.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/163649328026.309189.1124218109373941936.stgit@warthog.procyon.org.uk/ # v4 Link: https://lore.kernel.org/r/163657852454.834781.9265101983152100556.stgit@warthog.procyon.org.uk/ # v5
2021-11-02Merge tag 'afs-next-20211102' of ↵Linus Torvalds4-28/+27
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull AFS updates from David Howells: - Split the readpage handler for symlinks from the one for files. The symlink readpage isn't given a file pointer, so the handling has to be special-cased. This has been posted as part of a patchset to foliate netfs, afs, etc.[1] but I've moved it to this one as it's not actually doing foliation but is more of a pre-cleanup. - Fix file creation to set the mtime from the client's clock to keep make happy if the server's clock isn't quite in sync.[2] Link: https://lore.kernel.org/r/163005742570.2472992.7800423440314043178.stgit@warthog.procyon.org.uk/ [1] Link: http://lists.infradead.org/pipermail/linux-afs/2021-October/004395.html [2] * tag 'afs-next-20211102' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: afs: Set mtime from the client for yfs create operations afs: Sort out symlink reading
2021-11-02afs: Set mtime from the client for yfs create operationsMarc Dionne1-19/+13
For operations that create vnodes on the server such as CreateFile, MakeDir or Symlink, the server will store its own current time as the mtime if the client doesn't pass in a time in the accompanying StoreStatus structure. If the server and client clocks are not well synchronized, the client may see timestamps in the future or inconsistent dependency checks with "make" for files that are not modified after creation: make[2]: Warning: File 'arch/x86/kernel/apic/modules.order' has modification time 0.14 s in the future make[2]: warning: Clock skew detected. Your build may be incomplete. This is already handled correctly for non yfs operations; also set the mtime for the corresponding yfs operations. Changes: v3: Replace S_IRWXUGO with 0777, per checkpatch v2: [dhowells] Merge the two xdr_encode_YFSStoreStatus*() functions together Signed-off-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Link: http://lists.infradead.org/pipermail/linux-afs/2021-October/004395.html
2021-11-02afs: Sort out symlink readingDavid Howells3-9/+14
afs_readpage() doesn't get a file pointer when called for a symlink, so separate it from regular file pointer handling. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Link: https://lore.kernel.org/r/162687508008.276387.6418924257569297305.stgit@warthog.procyon.org.uk/ # rfc Link: https://lore.kernel.org/r/162981152280.1901565.2264055504466731917.stgit@warthog.procyon.org.uk/ Link: https://lore.kernel.org/r/163005742570.2472992.7800423440314043178.stgit@warthog.procyon.org.uk/ # v2
2021-11-01Merge tag 'folio-5.16' of git://git.infradead.org/users/willy/pagecacheLinus Torvalds1-4/+5
Pull memory folios from Matthew Wilcox: "Add memory folios, a new type to represent either order-0 pages or the head page of a compound page. This should be enough infrastructure to support filesystems converting from pages to folios. The point of all this churn is to allow filesystems and the page cache to manage memory in larger chunks than PAGE_SIZE. The original plan was to use compound pages like THP does, but I ran into problems with some functions expecting only a head page while others expect the precise page containing a particular byte. The folio type allows a function to declare that it's expecting only a head page. Almost incidentally, this allows us to remove various calls to VM_BUG_ON(PageTail(page)) and compound_head(). This converts just parts of the core MM and the page cache. For 5.17, we intend to convert various filesystems (XFS and AFS are ready; other filesystems may make it) and also convert more of the MM and page cache to folios. For 5.18, multi-page folios should be ready. The multi-page folios offer some improvement to some workloads. The 80% win is real, but appears to be an artificial benchmark (postgres startup, which isn't a serious workload). Real workloads (eg building the kernel, running postgres in a steady state, etc) seem to benefit between 0-10%. I haven't heard of any performance losses as a result of this series. Nobody has done any serious performance tuning; I imagine that tweaking the readahead algorithm could provide some more interesting wins. There are also other places where we could choose to create large folios and currently do not, such as writes that are larger than PAGE_SIZE. I'd like to thank all my reviewers who've offered review/ack tags: Christoph Hellwig, David Howells, Jan Kara, Jeff Layton, Johannes Weiner, Kirill A. Shutemov, Michal Hocko, Mike Rapoport, Vlastimil Babka, William Kucharski, Yu Zhao and Zi Yan. I'd also like to thank those who gave feedback I incorporated but haven't offered up review tags for this part of the series: Nick Piggin, Mel Gorman, Ming Lei, Darrick Wong, Ted Ts'o, John Hubbard, Hugh Dickins, and probably a few others who I forget" * tag 'folio-5.16' of git://git.infradead.org/users/willy/pagecache: (90 commits) mm/writeback: Add folio_write_one mm/filemap: Add FGP_STABLE mm/filemap: Add filemap_get_folio mm/filemap: Convert mapping_get_entry to return a folio mm/filemap: Add filemap_add_folio() mm/filemap: Add filemap_alloc_folio mm/page_alloc: Add folio allocation functions mm/lru: Add folio_add_lru() mm/lru: Convert __pagevec_lru_add_fn to take a folio mm: Add folio_evictable() mm/workingset: Convert workingset_refault() to take a folio mm/filemap: Add readahead_folio() mm/filemap: Add folio_mkwrite_check_truncate() mm/filemap: Add i_blocks_per_folio() mm/writeback: Add folio_redirty_for_writepage() mm/writeback: Add folio_account_redirty() mm/writeback: Add folio_clear_dirty_for_io() mm/writeback: Add folio_cancel_dirty() mm/writeback: Add folio_account_cleaned() mm/writeback: Add filemap_dirty_folio() ...
2021-10-07Merge tag 'misc-fixes-20211007' of ↵Linus Torvalds1-2/+1
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull netfslib, cachefiles and afs fixes from David Howells: - Fix another couple of oopses in cachefiles tracing stemming from the possibility of passing in a NULL object pointer - Fix netfs_clear_unread() to set READ on the iov_iter so that source it is passed to doesn't do the wrong thing (some drivers look at the flag on iov_iter rather than other available information to determine the direction) - Fix afs_launder_page() to write back at the correct file position on the server so as not to corrupt data * tag 'misc-fixes-20211007' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: afs: Fix afs_launder_page() to set correct start file position netfs: Fix READ/WRITE confusion when calling iov_iter_xarray() cachefiles: Fix oops with cachefiles_cull() due to NULL object
2021-10-05afs: Fix afs_launder_page() to set correct start file positionDavid Howells1-2/+1
Fix afs_launder_page() to set the starting position of the StoreData RPC at the offset into the page at which the modified data starts instead of at the beginning of the page (the iov_iter is correctly offset). The offset got lost during the conversion to passing an iov_iter into afs_store_data(). Changes: ver #2: - Use page_offset() rather than manually calculating it[1]. Fixes: bd80d8a80e12 ("afs: Use ITER_XARRAY for writing") Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeffrey Altman <jaltman@auristor.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/YST/0e92OdSH0zjg@casper.infradead.org/ [1] Link: https://lore.kernel.org/r/162880783179.3421678.7795105718190440134.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/162937512409.1449272.18441473411207824084.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/162981148752.1901565.3663780601682206026.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/163005741670.2472992.2073548908229887941.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163221839087.3143591.14278359695763025231.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163292980654.4004896.7134735179887998551.stgit@warthog.procyon.org.uk/ # v2
2021-10-04afs: Fix kerneldoc warning shown up by W=1David Howells1-2/+2
Fix a kerneldoc warning in afs due to a partially documented internal function by removing the kerneldoc marker. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org cc: linux-fsdevel@vger.kernel.org cc: linux-doc@vger.kernel.org Link: https://lore.kernel.org/r/163214005516.2945267.7000234432243167892.stgit@warthog.procyon.org.uk/ # rfc v1 Link: https://lore.kernel.org/r/163281899704.2790286.9177774252843775348.stgit@warthog.procyon.org.uk/ # rfc v2
2021-09-27mm/writeback: Add folio_wait_writeback()Matthew Wilcox (Oracle)1-4/+5
wait_on_page_writeback_killable() only has one caller, so convert it to call folio_wait_writeback_killable(). For the wait_on_page_writeback() callers, add a compatibility wrapper around folio_wait_writeback(). Turning PageWriteback() into folio_test_writeback() eliminates a call to compound_head() which saves 8 bytes and 15 bytes in the two functions. Unfortunately, that is more than offset by adding the wait_on_page_writeback compatibility wrapper for a net increase in text of 7 bytes. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Jeff Layton <jlayton@kernel.org> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: William Kucharski <william.kucharski@oracle.com> Acked-by: Mike Rapoport <rppt@linux.ibm.com> Reviewed-by: David Howells <dhowells@redhat.com>
2021-09-13afs: Fix updating of i_blocks on file/dir extensionDavid Howells4-13/+13
When an afs file or directory is modified locally such that the total file size is extended, i_blocks needs to be recalculated too. Fix this by making afs_write_end() and afs_edit_dir_add() call afs_set_i_size() rather than setting inode->i_size directly as that also recalculates inode->i_blocks. This can be tested by creating and writing into directories and files and then examining them with du. Without this change, directories show a 4 blocks (they start out at 2048 bytes) and files show 0 blocks; with this change, they should show a number of blocks proportional to the file size rounded up to 1024. Fixes: 31143d5d515e ("AFS: implement basic file write support") Fixes: 63a4681ff39c ("afs: Locally edit directory data for mkdir/create/unlink/...") Reported-by: Markus Suvanto <markus.suvanto@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Marc Dionne <marc.dionne@auristor.com> Tested-by: Markus Suvanto <markus.suvanto@gmail.com> cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/163113612442.352844.11162345591911691150.stgit@warthog.procyon.org.uk/
2021-09-13afs: Fix corruption in reads at fpos 2G-4G from an OpenAFS serverDavid Howells5-12/+49
AFS-3 has two data fetch RPC variants, FS.FetchData and FS.FetchData64, and Linux's afs client switches between them when talking to a non-YFS server if the read size, the file position or the sum of the two have the upper 32 bits set of the 64-bit value. This is a problem, however, since the file position and length fields of FS.FetchData are *signed* 32-bit values. Fix this by capturing the capability bits obtained from the fileserver when it's sent an FS.GetCapabilities RPC, rather than just discarding them, and then picking out the VICED_CAPABILITY_64BITFILES flag. This can then be used to decide whether to use FS.FetchData or FS.FetchData64 - and also FS.StoreData or FS.StoreData64 - rather than using upper_32_bits() to switch on the parameter values. This capabilities flag could also be used to limit the maximum size of the file, but all servers must be checked for that. Note that the issue does not exist with FS.StoreData - that uses *unsigned* 32-bit values. It's also not a problem with Auristor servers as its YFS.FetchData64 op uses unsigned 64-bit values. This can be tested by cloning a git repo through an OpenAFS client to an OpenAFS server and then doing "git status" on it from a Linux afs client[1]. Provided the clone has a pack file that's in the 2G-4G range, the git status will show errors like: error: packfile .git/objects/pack/pack-5e813c51d12b6847bbc0fcd97c2bca66da50079c.pack does not match index error: packfile .git/objects/pack/pack-5e813c51d12b6847bbc0fcd97c2bca66da50079c.pack does not match index This can be observed in the server's FileLog with something like the following appearing: Sun Aug 29 19:31:39 2021 SRXAFS_FetchData, Fid = 2303380852.491776.3263114, Host 192.168.11.201:7001, Id 1001 Sun Aug 29 19:31:39 2021 CheckRights: len=0, for host=192.168.11.201:7001 Sun Aug 29 19:31:39 2021 FetchData_RXStyle: Pos 18446744071815340032, Len 3154 Sun Aug 29 19:31:39 2021 FetchData_RXStyle: file size 2400758866 ... Sun Aug 29 19:31:40 2021 SRXAFS_FetchData returns 5 Note the file position of 18446744071815340032. This is the requested file position sign-extended. Fixes: b9b1f8d5930a ("AFS: write support fixes") Reported-by: Markus Suvanto <markus.suvanto@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Marc Dionne <marc.dionne@auristor.com> Tested-by: Markus Suvanto <markus.suvanto@gmail.com> cc: linux-afs@lists.infradead.org cc: openafs-devel@openafs.org Link: https://bugzilla.kernel.org/show_bug.cgi?id=214217#c9 [1] Link: https://lore.kernel.org/r/951332.1631308745@warthog.procyon.org.uk/
2021-09-13afs: Try to avoid taking RCU read lock when checking vnode validityDavid Howells4-45/+48
Try to avoid taking the RCU read lock when checking the validity of a vnode's callback state. The only thing it's needed for is to pin the parent volume's server list whilst we search it to find the record of the server we're currently using to see if it has been reinitialised (ie. it sent us a CB.InitCallBackState* RPC). Do this by the following means: (1) Keep an additional per-cell counter (fs_s_break) that's incremented each time any of the fileservers in the cell reinitialises. Since the new counter can be accessed without RCU from the vnode, we can check that first - and only if it differs, get the RCU read lock and check the volume's server list. (2) Replace afs_get_s_break_rcu() with afs_check_server_good() which now indicates whether the callback promise is still expected to be present on the server. This does the checks as described in (1). (3) Restructure afs_check_validity() to take account of the change in (2). We can also get rid of the valid variable and just use the need_clear variable with the addition of the afs_cb_break_no_promise reason. (4) afs_check_validity() probably shouldn't be altering vnode->cb_v_break and vnode->cb_s_break when it doesn't have cb_lock exclusively locked. Move the change to vnode->cb_v_break to __afs_break_callback(). Delegate the change to vnode->cb_s_break to afs_select_fileserver() and set vnode->cb_fs_s_break there also. (5) afs_validate() no longer needs to get the RCU read lock around its call to afs_check_validity() - and can skip the call entirely if we don't have a promise. Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Markus Suvanto <markus.suvanto@gmail.com> cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/163111669583.283156.1397603105683094563.stgit@warthog.procyon.org.uk/
2021-09-13afs: Fix mmap coherency vs 3rd-party changesDavid Howells6-3/+119
Fix the coherency management of mmap'd data such that 3rd-party changes become visible as soon as possible after the callback notification is delivered by the fileserver. This is done by the following means: (1) When we break a callback on a vnode specified by the CB.CallBack call from the server, we queue a work item (vnode->cb_work) to go and clobber all the PTEs mapping to that inode. This causes the CPU to trip through the ->map_pages() and ->page_mkwrite() handlers if userspace attempts to access the page(s) again. (Ideally, this would be done in the service handler for CB.CallBack, but the server is waiting for our reply before considering, and we have a list of vnodes, all of which need breaking - and the process of getting the mmap_lock and stripping the PTEs on all CPUs could be quite slow.) (2) Call afs_validate() from the ->map_pages() handler to check to see if the file has changed and to get a new callback promise from the server. Also handle the fileserver telling us that it's dropping all callbacks, possibly after it's been restarted by sending us a CB.InitCallBackState* call by the following means: (3) Maintain a per-cell list of afs files that are currently mmap'd (cell->fs_open_mmaps). (4) Add a work item to each server that is invoked if there are any open mmaps when CB.InitCallBackState happens. This work item goes through the aforementioned list and invokes the vnode->cb_work work item for each one that is currently using this server. This causes the PTEs to be cleared, causing ->map_pages() or ->page_mkwrite() to be called again, thereby calling afs_validate() again. I've chosen to simply strip the PTEs at the point of notification reception rather than invalidate all the pages as well because (a) it's faster, (b) we may get a notification for other reasons than the data being altered (in which case we don't want to clobber the pagecache) and (c) we need to ask the server to find out - and I don't want to wait for the reply before holding up userspace. This was tested using the attached test program: #include <stdbool.h> #include <stdio.h> #include <stdlib.h> #include <unistd.h> #include <fcntl.h> #include <sys/mman.h> int main(int argc, char *argv[]) { size_t size = getpagesize(); unsigned char *p; bool mod = (argc == 3); int fd; if (argc != 2 && argc != 3) { fprintf(stderr, "Format: %s <file> [mod]\n", argv[0]); exit(2); } fd = open(argv[1], mod ? O_RDWR : O_RDONLY); if (fd < 0) { perror(argv[1]); exit(1); } p = mmap(NULL, size, mod ? PROT_READ|PROT_WRITE : PROT_READ, MAP_SHARED, fd, 0); if (p == MAP_FAILED) { perror("mmap"); exit(1); } for (;;) { if (mod) { p[0]++; msync(p, size, MS_ASYNC); fsync(fd); } printf("%02x", p[0]); fflush(stdout); sleep(1); } } It runs in two modes: in one mode, it mmaps a file, then sits in a loop reading the first byte, printing it and sleeping for a second; in the second mode it mmaps a file, then sits in a loop incrementing the first byte and flushing, then printing and sleeping. Two instances of this program can be run on different machines, one doing the reading and one doing the writing. The reader should see the changes made by the writer, but without this patch, they aren't because validity checking is being done lazily - only on entry to the filesystem. Testing the InitCallBackState change is more complicated. The server has to be taken offline, the saved callback state file removed and then the server restarted whilst the reading-mode program continues to run. The client machine then has to poke the server to trigger the InitCallBackState call. Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Markus Suvanto <markus.suvanto@gmail.com> cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/163111668833.283156.382633263709075739.stgit@warthog.procyon.org.uk/
2021-09-13afs: Fix incorrect triggering of sillyrename on 3rd-party invalidationDavid Howells1-39/+7
The AFS filesystem is currently triggering the silly-rename cleanup from afs_d_revalidate() when it sees that a dentry has been changed by a third party[1]. It should not be doing this as the cleanup includes deleting the silly-rename target file on iput. Fix this by removing the places in the d_revalidate handling that validate anything other than the directory and the dirent. It probably should not be looking to validate the target inode of the dentry also. This includes removing the point in afs_d_revalidate() where the inode that a dentry used to point to was marked as being deleted (AFS_VNODE_DELETED). We don't know it got deleted. It could have been renamed or it could have hard links remaining. This was reproduced by cloning a git repo onto an afs volume on one machine, switching to another machine and doing "git status", then switching back to the first and doing "git status". The second status would show weird output due to ".git/index" getting deleted by the above mentioned mechanism. A simpler way to do it is to do: machine 1: touch a machine 2: touch b; mv -f b a machine 1: stat a on an afs volume. The bug shows up as the stat failing with ENOENT and the file server log showing that machine 1 deleted "a". Fixes: 79ddbfa500b3 ("afs: Implement sillyrename for unlink and rename") Reported-by: Markus Suvanto <markus.suvanto@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Markus Suvanto <markus.suvanto@gmail.com> cc: linux-afs@lists.infradead.org Link: https://bugzilla.kernel.org/show_bug.cgi?id=214217#c4 [1] Link: https://lore.kernel.org/r/163111668100.283156.3851669884664475428.stgit@warthog.procyon.org.uk/
2021-09-13afs: Add missing vnode validation checksDavid Howells3-3/+41
afs_d_revalidate() should only be validating the directory entry it is given and the directory to which that belongs; it shouldn't be validating the inode/vnode to which that dentry points. Besides, validation need to be done even if we don't call afs_d_revalidate() - which might be the case if we're starting from a file descriptor. In order for afs_d_revalidate() to be fixed, validation points must be added in some other places. Certain directory operations, such as afs_unlink(), already check this, but not all and not all file operations either. Note that the validation of a vnode not only checks to see if the attributes we have are correct, but also gets a promise from the server to notify us if that file gets changed by a third party. Add the following checks: - Check the vnode we're going to make a hard link to. - Check the vnode we're going to move/rename. - Check the vnode we're going to read from. - Check the vnode we're going to write to. - Check the vnode we're going to sync. - Check the vnode we're going to make a mapped page writable for. Some of these aren't strictly necessary as we're going to perform a server operation that might get the attributes anyway from which we can determine if something changed - though it might not get us a callback promise. Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Markus Suvanto <markus.suvanto@gmail.com> cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/163111667354.283156.12720698333342917516.stgit@warthog.procyon.org.uk/
2021-09-10afs: Fix page leakDavid Howells1-2/+8
There's a loop in afs_extend_writeback() that adds extra pages to a write we want to make to improve the efficiency of the writeback by making it larger. This loop stops, however, if we hit a page we can't write back from immediately, but it doesn't get rid of the page ref we speculatively acquired. This was caused by the removal of the cleanup loop when the code switched from using find_get_pages_contig() to xarray scanning as the latter only gets a single page at a time, not a batch. Fix this by putting the page on a ref on an early break from the loop. Unfortunately, we can't just add that page to the pagevec we're employing as we'll go through that and add those pages to the RPC call. This was found by the generic/074 test. It leaks ~4GiB of RAM each time it is run - which can be observed with "top". Fixes: e87b03f5830e ("afs: Prepare for use of THPs") Reported-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-and-tested-by: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/163111666635.283156.177701903478910460.stgit@warthog.procyon.org.uk/
2021-09-10afs: Fix missing put on afs_read objects and missing get on the key thereinDavid Howells1-1/+2
The afs_read objects created by afs_req_issue_op() get leaked because afs_alloc_read() returns a ref and then afs_fetch_data() gets its own ref which is released when the operation completes, but the initial ref is never released. Fix this by discarding the initial ref at the end of afs_req_issue_op(). This leak also covered another bug whereby a ref isn't got on the key attached to the read record by afs_req_issue_op(). This isn't a problem as long as the afs_read req never goes away... Fix this by calling key_get() in afs_req_issue_op(). This was found by the generic/074 test. It leaks a bunch of kmalloc-192 objects each time it is run, which can be observed by watching /proc/slabinfo. Fixes: f7605fa869cf ("afs: Fix leak of afs_read objects") Reported-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-and-tested-by: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/163010394740.3035676.8516846193899793357.stgit@warthog.procyon.org.uk/ Link: https://lore.kernel.org/r/163111665914.283156.3038561975681836591.stgit@warthog.procyon.org.uk/
2021-08-23fs: remove mandatory file locking supportJeff Layton1-4/+0
We added CONFIG_MANDATORY_FILE_LOCKING in 2015, and soon after turned it off in Fedora and RHEL8. Several other distros have followed suit. I've heard of one problem in all that time: Someone migrated from an older distro that supported "-o mand" to one that didn't, and the host had a fstab entry with "mand" in it which broke on reboot. They didn't actually _use_ mandatory locking so they just removed the mount option and moved on. This patch rips out mandatory locking support wholesale from the kernel, along with the Kconfig option and the Documentation file. It also changes the mount code to ignore the "mand" mount option instead of erroring out, and to throw a big, ugly warning. Signed-off-by: Jeff Layton <jlayton@kernel.org>
2021-07-21afs: Remove redundant assignment to retJiapeng Chong1-4/+6
Variable ret is set to -ENOENT and -ENOMEM but this value is never read as it is overwritten or not used later on, hence it is a redundant assignment and can be removed. Cleans up the following clang-analyzer warning: fs/afs/dir.c:2014:4: warning: Value stored to 'ret' is never read [clang-analyzer-deadcode.DeadStores]. fs/afs/dir.c:659:2: warning: Value stored to 'ret' is never read [clang-analyzer-deadcode.DeadStores]. [DH made the following modifications: - In afs_rename(), -ENOMEM should be placed in op->error instead of ret, rather than the assignment being removed entirely. afs_put_operation() will pick it up from there and return it. - If afs_sillyrename() fails, its error code should be placed in op->error rather than in ret also. ] Fixes: e49c7b2f6de7 ("afs: Build an abstraction around an "operation" concept") Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/1619691492-83866-1-git-send-email-jiapeng.chong@linux.alibaba.com Link: https://lore.kernel.org/r/162609465444.3133237.7562832521724298900.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/162610729052.3408253.17364333638838151299.stgit@warthog.procyon.org.uk/ # v2
2021-07-21afs: Fix setting of writeback_indexDavid Howells1-1/+1
Fix afs_writepages() to always set mapping->writeback_index to a page index and not a byte position[1]. Fixes: 31143d5d515e ("AFS: implement basic file write support") Reported-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/CAB9dFdvHsLsw7CMnB+4cgciWDSqVjuij4mH3TaXnHQB8sz5rHw@mail.gmail.com/ [1] Link: https://lore.kernel.org/r/162610728339.3408253.4604750166391496546.stgit@warthog.procyon.org.uk/ # v2 (no v1)
2021-07-21afs: check function returnTom Rix1-5/+11
Static analysis reports this problem write.c:773:29: warning: Assigned value is garbage or undefined mapping->writeback_index = next; ^ ~~~~ The call to afs_writepages_region() can return without setting next. So check the function return before using next. Changes: ver #2: - Need to fix the range_cyclic case also[1]. Fixes: e87b03f5830e ("afs: Prepare for use of THPs") Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/20210430155031.3287870-1-trix@redhat.com Link: https://lore.kernel.org/r/CAB9dFdvHsLsw7CMnB+4cgciWDSqVjuij4mH3TaXnHQB8sz5rHw@mail.gmail.com/ [1] Link: https://lore.kernel.org/r/162609464716.3133237.10354897554363093252.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/162610727640.3408253.8687445613469681311.stgit@warthog.procyon.org.uk/ # v2
2021-07-21afs: Fix tracepoint string placement with built-in AFSDavid Howells1-18/+7
To quote Alexey[1]: I was adding custom tracepoint to the kernel, grabbed full F34 kernel .config, disabled modules and booted whole shebang as VM kernel. Then did perf record -a -e ... It crashed: general protection fault, probably for non-canonical address 0x435f5346592e4243: 0000 [#1] SMP PTI CPU: 1 PID: 842 Comm: cat Not tainted 5.12.6+ #26 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-1.fc33 04/01/2014 RIP: 0010:t_show+0x22/0xd0 Then reproducer was narrowed to # cat /sys/kernel/tracing/printk_formats Original F34 kernel with modules didn't crash. So I started to disable options and after disabling AFS everything started working again. The root cause is that AFS was placing char arrays content into a section full of _pointers_ to strings with predictable consequences. Non canonical address 435f5346592e4243 is "CB.YFS_" which came from CM_NAME macro. Steps to reproduce: CONFIG_AFS=y CONFIG_TRACING=y # cat /sys/kernel/tracing/printk_formats Fix this by the following means: (1) Add enum->string translation tables in the event header with the AFS and YFS cache/callback manager operations listed by RPC operation ID. (2) Modify the afs_cb_call tracepoint to print the string from the translation table rather than using the string at the afs_call name pointer. (3) Switch translation table depending on the service we're being accessed as (AFS or YFS) in the tracepoint print clause. Will this cause problems to userspace utilities? Note that the symbolic representation of the YFS service ID isn't available to this header, so I've put it in as a number. I'm not sure if this is the best way to do this. (4) Remove the name wrangling (CM_NAME) macro and put the names directly into the afs_call_type structs in cmservice.c. Fixes: 8e8d7f13b6d5a9 ("afs: Add some tracepoints") Reported-by: Alexey Dobriyan (SK hynix) <adobriyan@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Reviewed-by: Marc Dionne <marc.dionne@auristor.com> cc: Andrew Morton <akpm@linux-foundation.org> cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/YLAXfvZ+rObEOdc%2F@localhost.localdomain/ [1] Link: https://lore.kernel.org/r/643721.1623754699@warthog.procyon.org.uk/ Link: https://lore.kernel.org/r/162430903582.2896199.6098150063997983353.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/162609463957.3133237.15916579353149746363.stgit@warthog.procyon.org.uk/ # v1 (repost) Link: https://lore.kernel.org/r/162610726860.3408253.445207609466288531.stgit@warthog.procyon.org.uk/ # v2
2021-06-25Merge tag 'netfs-fixes-20210621' of ↵Linus Torvalds1-2/+9
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull netfs fixes from David Howells: "This contains patches to fix netfs_write_begin() and afs_write_end() in the following ways: (1) In netfs_write_begin(), extract the decision about whether to skip a page out to its own helper and have that clear around the region to be written, but not clear that region. This requires the filesystem to patch it up afterwards if the hole doesn't get completely filled. (2) Use offset_in_thp() in (1) rather than manually calculating the offset into the page. (3) Due to (1), afs_write_end() now needs to handle short data write into the page by generic_perform_write(). I've adopted an analogous approach to ceph of just returning 0 in this case and letting the caller go round again. It also adds a note that (in the future) the len parameter may extend beyond the page allocated. This is because the page allocation is deferred to write_begin() and that gets to decide what size of THP to allocate." Jeff Layton points out: "The netfs fix in particular fixes a data corruption bug in cephfs" * tag 'netfs-fixes-20210621' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: netfs: fix test for whether we can skip read when writing beyond EOF afs: Fix afs_write_end() to handle short writes
2021-06-21afs: Fix afs_write_end() to handle short writesDavid Howells1-2/+9
Fix afs_write_end() to correctly handle a short copy into the intended write region of the page. Two things are necessary: (1) If the page is not up to date, then we should just return 0 (ie. indicating a zero-length copy). The loop in generic_perform_write() will go around again, possibly breaking up the iterator into discrete chunks[1]. This is analogous to commit b9de313cf05fe08fa59efaf19756ec5283af672a for ceph. (2) The page should not have been set uptodate if it wasn't completely set up by netfs_write_begin() (this will be fixed in the next patch), so we need to set uptodate here in such a case. Also remove the assertion that was checking that the page was set uptodate since it's now set uptodate if it wasn't already a few lines above. The assertion was from when uptodate was set elsewhere. Changes: v3: Remove the handling of len exceeding the end of the page. Fixes: 3003bbd0697b ("afs: Use the netfs_write_begin() helper") Reported-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> cc: Al Viro <viro@zeniv.linux.org.uk> cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/YMwVp268KTzTf8cN@zeniv-ca.linux.org.uk/ [1] Link: https://lore.kernel.org/r/162367682522.460125.5652091227576721609.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/162391825688.1173366.3437507255136307904.stgit@warthog.procyon.org.uk/ # v2
2021-06-18afs: Re-enable freezing once a page fault is interruptedMatthew Wilcox (Oracle)1-5/+8
If a task is killed during a page fault, it does not currently call sb_end_pagefault(), which means that the filesystem cannot be frozen at any time thereafter. This may be reported by lockdep like this: ==================================== WARNING: fsstress/10757 still has locks held! 5.13.0-rc4-build4+ #91 Not tainted ------------------------------------ 1 lock held by fsstress/10757: #0: ffff888104eac530 ( sb_pagefaults as filesystem freezing is modelled as a lock. Fix this by removing all the direct returns from within the function, and using 'ret' to indicate whether we were interrupted or successful. Fixes: 1cf7a1518aef ("afs: Implement shared-writeable mmap") Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: David Howells <dhowells@redhat.com> cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/20210616154900.1958373-1-willy@infradead.org/ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-15afs: Fix an IS_ERR() vs NULL checkDan Carpenter1-2/+2
The proc_symlink() function returns NULL on error, it doesn't return error pointers. Fixes: 5b86d4ff5dce ("afs: Implement network namespacing") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David Howells <dhowells@redhat.com> cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/YLjMRKX40pTrJvgf@mwanda/ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-07afs: Fix partial writeback of large files on fsync and closeMarc Dionne1-1/+1
In commit e87b03f5830e ("afs: Prepare for use of THPs"), the return value for afs_write_back_from_locked_page was changed from a number of pages to a length in bytes. The loop in afs_writepages_region uses the return value to compute the index that will be used to find dirty pages in the next iteration, but treats it as a number of pages and wrongly multiplies it by PAGE_SIZE. This gives a very large index value, potentially skipping any dirty data that was not covered in the first pass, which is limited to 256M. This causes fsync(), and indirectly close(), to only do a partial writeback of a large file's dirty data. The rest is eventually written back by background threads after dirty_expire_centisecs. Fixes: e87b03f5830e ("afs: Prepare for use of THPs") Signed-off-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeffrey Altman <jaltman@auristor.com> cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/20210604175504.4055-1-marc.c.dionne@gmail.com/ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-27afs: Fix the nlink handling of dir-over-dir renameDavid Howells1-1/+3
Fix rename of one directory over another such that the nlink on the deleted directory is cleared to 0 rather than being decremented to 1. This was causing the generic/035 xfstest to fail. Fixes: e49c7b2f6de7 ("afs: Build an abstraction around an "operation" concept") Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/162194384460.3999479.7605572278074191079.stgit@warthog.procyon.org.uk/ # v1 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-25afs: Fix fall-through warnings for ClangGustavo A. R. Silva3-0/+10
In preparation to enable -Wimplicit-fallthrough for Clang, fix multiple warnings by explicitly adding multiple fallthrough pseudo-keywords in places where the code is intended to fall through to the next case. Link: https://github.com/KSPP/linux/issues/115 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeffrey Altman <jaltman@auristor.com> cc: linux-afs@lists.infradead.org cc: linux-hardening@vger.kernel.org Link: https://lore.kernel.org/r/51150b54e0b0431a2c401cd54f2c4e7f50e94601.1605896059.git.gustavoars@kernel.org/ # v1 Link: https://lore.kernel.org/r/20210420211615.GA51432@embeddedor/ # v2 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-01afs: Fix speculative status fetchesDavid Howells6-2/+23
The generic/464 xfstest causes kAFS to emit occasional warnings of the form: kAFS: vnode modified {100055:8a} 30->31 YFS.StoreData64 (c=6015) This indicates that the data version received back from the server did not match the expected value (the DV should be incremented monotonically for each individual modification op committed to a vnode). What is happening is that a lookup call is doing a bulk status fetch speculatively on a bunch of vnodes in a directory besides getting the status of the vnode it's actually interested in. This is racing with a StoreData operation (though it could also occur with, say, a MakeDir op). On the client, a modification operation locks the vnode, but the bulk status fetch only locks the parent directory, so no ordering is imposed there (thereby avoiding an avenue to deadlock). On the server, the StoreData op handler doesn't lock the vnode until it's received all the request data, and downgrades the lock after committing the data until it has finished sending change notifications to other clients - which allows the status fetch to occur before it has finished. This means that: - a status fetch can access the target vnode either side of the exclusive section of the modification - the status fetch could start before the modification, yet finish after, and vice-versa. - the status fetch and the modification RPCs can complete in either order. - the status fetch can return either the before or the after DV from the modification. - the status fetch might regress the locally cached DV. Some of these are handled by the previous fix[1], but that's not sufficient because it checks the DV it received against the DV it cached at the start of the op, but the DV might've been updated in the meantime by a locally generated modification op. Fix this by the following means: (1) Keep track of when we're performing a modification operation on a vnode. This is done by marking vnode parameters with a 'modification' note that causes the AFS_VNODE_MODIFYING flag to be set on the vnode for the duration. (2) Alter the speculation race detection to ignore speculative status fetches if either the vnode is marked as being modified or the data version number is not what we expected. Note that whilst the "vnode modified" warning does get recovered from as it causes the client to refetch the status at the next opportunity, it will also invalidate the pagecache, so changes might get lost. Fixes: a9e5c87ca744 ("afs: Fix speculative status fetch going out of order wrt to modifications") Reported-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Tested-and-reviewed-by: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/160605082531.252452.14708077925602709042.stgit@warthog.procyon.org.uk/ [1] Link: https://lore.kernel.org/linux-fsdevel/161961335926.39335.2552653972195467566.stgit@warthog.procyon.org.uk/ # v1 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-04-27Merge tag 'afs-netfs-lib-20210426' of ↵Linus Torvalds10-1009/+767
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull AFS updates from David Howells: "Use the new netfs lib. Begin the process of overhauling the use of the fscache API by AFS and the introduction of support for features such as Transparent Huge Pages (THPs). - Add some support for THPs, including using core VM helper functions to find details of pages. - Use the ITER_XARRAY I/O iterator to mediate access to the pagecache as this handles THPs and doesn't require allocation of large bvec arrays. - Delegate address_space read/pre-write I/O methods for AFS to the netfs helper library. A method is provided to the library that allows it to issue a read against the server. This includes a change in use for PG_fscache (it now indicates a DIO write in progress from the marked page), so a number of waits need to be deployed for it. - Split the core AFS writeback function to make it easier to modify in future patches to handle writing to the cache. [This might feasibly make more sense moved out into my fscache-iter branch]. I've tested these with "xfstests -g quick" against an AFS volume (xfstests needs patching to make it work). With this, AFS without a cache passes all expected xfstests; with a cache, there's an extra failure, but that's also there before these patches. Fixing that probably requires a greater overhaul (as can be found on my fscache-iter branch, but that's for a later time). Thanks should go to Marc Dionne and Jeff Altman of AuriStor for exercising the patches in their test farm also" Link: https://lore.kernel.org/lkml/3785063.1619482429@warthog.procyon.org.uk/ * tag 'afs-netfs-lib-20210426' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: afs: Use the netfs_write_begin() helper afs: Use new netfs lib read helper API afs: Use the fs operation ops to handle FetchData completion afs: Prepare for use of THPs afs: Extract writeback extension into its own function afs: Wait on PG_fscache before modifying/releasing a page afs: Use ITER_XARRAY for writing afs: Set up the iov_iter before calling afs_extract_data() afs: Log remote unmarshalling errors afs: Don't truncate iter during data fetch afs: Move key to afs_read struct afs: Print the operation debug_id when logging an unexpected data version afs: Pass page into dirty region helpers to provide THP size afs: Disable use of the fscache I/O routines
2021-04-27Merge branch 'work.inode-type-fixes' of ↵Linus Torvalds1-3/+3
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs inode type handling updates from Al Viro: "We should never change the type bits of ->i_mode or the method tables (->i_op and ->i_fop) of a live inode. Unfortunately, not all filesystems took care to prevent that" * 'work.inode-type-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: spufs: fix bogosity in S_ISGID handling 9p: missing chunk of "fs/9p: Don't update file type when updating file attributes" openpromfs: don't do unlock_new_inode() until the new inode is set up hostfs_mknod(): don't bother with init_special_inode() cifs: have cifs_fattr_to_inode() refuse to change type on live inode cifs: have ->mkdir() handle race with another client sanely do_cifs_create(): don't set ->i_mode of something we had not created gfs2: be careful with inode refresh ocfs2_inode_lock_update(): make sure we don't change the type bits of i_mode orangefs_inode_is_stale(): i_mode type bits do *not* form a bitmap... vboxsf: don't allow to change the inode type afs: Fix updating of i_mode due to 3rd party change ceph: don't allow type or device number to change on non-I_NEW inodes ceph: fix up error handling with snapdirs new helper: inode_wrong_type()
2021-04-23afs: Use the netfs_write_begin() helperDavid Howells3-97/+31
Make AFS use the new netfs_write_begin() helper to do the pre-reading required before the write. If successful, the helper returns with the required page filled in and locked. It may read more than just one page, expanding the read to meet cache granularity requirements as necessary. Note: A more advanced version of this could be made that does generic_perform_write() for a whole cache granule. This would make it easier to avoid doing the download/read for the data to be overwritten. Signed-off-by: David Howells <dhowells@redhat.com> Tested-By: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/160588546422.3465195.1546354372589291098.stgit@warthog.procyon.org.uk/ # rfc Link: https://lore.kernel.org/r/161539563244.286939.16537296241609909980.stgit@warthog.procyon.org.uk/ # v4 Link: https://lore.kernel.org/r/161653819291.2770958.406013201547420544.stgit@warthog.procyon.org.uk/ # v5 Link: https://lore.kernel.org/r/161789102743.6155.17396591236631761195.stgit@warthog.procyon.org.uk/ # v6
2021-04-23afs: Use new netfs lib read helper APIDavid Howells5-251/+88
Make AFS use the new netfs read helpers to implement the VM read operations: - afs_readpage() now hands off responsibility to netfs_readpage(). - afs_readpages() is gone and replaced with afs_readahead(). - afs_readahead() just hands off responsibility to netfs_readahead(). These make use of the cache if a cookie is supplied, otherwise just call the ->issue_op() method a sufficient number of times to complete the entire request. Changes: v5: - Use proper wait function for PG_fscache in afs_page_mkwrite()[1]. - Use killable wait for PG_writeback in afs_page_mkwrite()[1]. v4: - Folded in error handling fixes to afs_req_issue_op(). - Added flag to netfs_subreq_terminated() to indicate that the caller may have been running async and stuff that might sleep needs punting to a workqueue. Signed-off-by: David Howells <dhowells@redhat.com> Tested-By: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/2499407.1616505440@warthog.procyon.org.uk [1] Link: https://lore.kernel.org/r/160588542733.3465195.7526541422073350302.stgit@warthog.procyon.org.uk/ # rfc Link: https://lore.kernel.org/r/161118158436.1232039.3884845981224091996.stgit@warthog.procyon.org.uk/ # rfc Link: https://lore.kernel.org/r/161161053540.2537118.14904446369309535330.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/161340418739.1303470.5908092911600241280.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/161539561926.286939.5729036262354802339.stgit@warthog.procyon.org.uk/ # v4 Link: https://lore.kernel.org/r/161653817977.2770958.17696456811587237197.stgit@warthog.procyon.org.uk/ # v5 Link: https://lore.kernel.org/r/161789101258.6155.3879271028895121537.stgit@warthog.procyon.org.uk/ # v6
2021-04-23afs: Use the fs operation ops to handle FetchData completionDavid Howells5-7/+19
Use the 'success' and 'aborted' afs_operations_ops methods and add a 'failed' method to handle the completion of an AFS.FetchData, AFS.FetchData64 or YFS.FetchData64 RPC operation rather than directly calling the done func pointed to by the afs_read struct from the call delivery handler. This means the done function will be called back on error also, not just on successful completion. This allows motion towards asynchronous data reception on data fetch calls and allows any error to be handed off to the fscache read helper in the same place as a successful completion. Signed-off-by: David Howells <dhowells@redhat.com> Tested-By: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/160588541471.3465195.8807019223378490810.stgit@warthog.procyon.org.uk/ # rfc Link: https://lore.kernel.org/r/161118157260.1232039.6549085372718234792.stgit@warthog.procyon.org.uk/ # rfc Link: https://lore.kernel.org/r/161161052647.2537118.12922380836599003659.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/161340417106.1303470.3502017303898569631.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/161539560673.286939.391310781674212229.stgit@warthog.procyon.org.uk/ # v4 Link: https://lore.kernel.org/r/161653816367.2770958.5856904574822446404.stgit@warthog.procyon.org.uk/ # v5 Link: https://lore.kernel.org/r/161789099994.6155.473719823490561190.stgit@warthog.procyon.org.uk/ # v6
2021-04-23afs: Prepare for use of THPsDavid Howells4-202/+244
As a prelude to supporting transparent huge pages, use thp_size() and similar rather than PAGE_SIZE/SHIFT. Further, try and frame everything in terms of file positions and lengths rather than page indices and numbers of pages. Signed-off-by: David Howells <dhowells@redhat.com> Tested-By: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/160588540227.3465195.4752143929716269062.stgit@warthog.procyon.org.uk/ # rfc Link: https://lore.kernel.org/r/161118155821.1232039.540445038028845740.stgit@warthog.procyon.org.uk/ # rfc Link: https://lore.kernel.org/r/161161051439.2537118.15577827510426326534.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/161340415869.1303470.6040191748634322355.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/161539559365.286939.18344613540296085269.stgit@warthog.procyon.org.uk/ # v4 Link: https://lore.kernel.org/r/161653815142.2770958.454490670311230206.stgit@warthog.procyon.org.uk/ # v5 Link: https://lore.kernel.org/r/161789098713.6155.16394227991842480300.stgit@warthog.procyon.org.uk/ # v6
2021-04-23afs: Extract writeback extension into its own functionDavid Howells1-42/+67
Extract writeback extension into its own function to break up the writeback function a bit. Signed-off-by: David Howells <dhowells@redhat.com> Tested-By: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/160588538471.3465195.782513375683399583.stgit@warthog.procyon.org.uk/ # rfc Link: https://lore.kernel.org/r/161118154610.1232039.1765365632920504822.stgit@warthog.procyon.org.uk/ # rfc Link: https://lore.kernel.org/r/161161050546.2537118.2202554806419189453.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/161340414102.1303470.9078891484034668985.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/161539558417.286939.2879469588895925399.stgit@warthog.procyon.org.uk/ # v4 Link: https://lore.kernel.org/r/161653813972.2770958.12671731209438112378.stgit@warthog.procyon.org.uk/ # v5 Link: https://lore.kernel.org/r/161789097132.6155.4916609419912731964.stgit@warthog.procyon.org.uk/ # v6
2021-04-23afs: Wait on PG_fscache before modifying/releasing a pageDavid Howells2-0/+19
PG_fscache is going to be used to indicate that a page is being written to the cache, and that the page should not be modified or released until it's finished. Make afs_invalidatepage() and afs_releasepage() wait for it. Signed-off-by: David Howells <dhowells@redhat.com> Tested-By: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/158861253957.340223.7465334678444521655.stgit@warthog.procyon.org.uk/ # rfc Link: https://lore.kernel.org/r/159465832417.1377938.3571599385208729791.stgit@warthog.procyon.org.uk/ Link: https://lore.kernel.org/r/160588536286.3465195.13231895135369807920.stgit@warthog.procyon.org.uk/ # rfc Link: https://lore.kernel.org/r/161118153708.1232039.3535103645871176749.stgit@warthog.procyon.org.uk/ # rfc Link: https://lore.kernel.org/r/161161049369.2537118.11591934943429117060.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/161340412903.1303470.6424701655031380012.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/161539556890.286939.5873470593519458598.stgit@warthog.procyon.org.uk/ # v4 Link: https://lore.kernel.org/r/161653812726.2770958.18167145829938766503.stgit@warthog.procyon.org.uk/ # v5 Link: https://lore.kernel.org/r/161789096241.6155.5907241930823579235.stgit@warthog.procyon.org.uk/ # v6
2021-04-23afs: Use ITER_XARRAY for writingDavid Howells5-186/+107
Use a single ITER_XARRAY iterator to describe the portion of a file to be transmitted to the server rather than generating a series of small ITER_BVEC iterators on the fly. This will make it easier to implement AIO in afs. In theory we could maybe use one giant ITER_BVEC, but that means potentially allocating a huge array of bio_vec structs (max 256 per page) when in fact the pagecache already has a structure listing all the relevant pages (radix_tree/xarray) that can be walked over. Signed-off-by: David Howells <dhowells@redhat.com> Tested-By: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/153685395197.14766.16289516750731233933.stgit@warthog.procyon.org.uk/ Link: https://lore.kernel.org/r/158861251312.340223.17924900795425422532.stgit@warthog.procyon.org.uk/ # rfc Link: https://lore.kernel.org/r/159465828607.1377938.6903132788463419368.stgit@warthog.procyon.org.uk/ Link: https://lore.kernel.org/r/160588535018.3465195.14509994354240338307.stgit@warthog.procyon.org.uk/ # rfc Link: https://lore.kernel.org/r/161118152415.1232039.6452879415814850025.stgit@warthog.procyon.org.uk/ # rfc Link: https://lore.kernel.org/r/161161048194.2537118.13763612220937637316.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/161340411602.1303470.4661108879482218408.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/161539555629.286939.5241869986617154517.stgit@warthog.procyon.org.uk/ # v4 Link: https://lore.kernel.org/r/161653811456.2770958.7017388543246759245.stgit@warthog.procyon.org.uk/ # v5 Link: https://lore.kernel.org/r/161789095005.6155.6789055030327407928.stgit@warthog.procyon.org.uk/ # v6
2021-04-23afs: Set up the iov_iter before calling afs_extract_data()David Howells6-249/+314
afs_extract_data() sets up a temporary iov_iter and passes it to AF_RXRPC each time it is called to describe the remaining buffer to be filled. Instead: (1) Put an iterator in the afs_call struct. (2) Set the iterator for each marshalling stage to load data into the appropriate places. A number of convenience functions are provided to this end (eg. afs_extract_to_buf()). This iterator is then passed to afs_extract_data(). (3) Use the new ITER_XARRAY iterator when reading data to load directly into the inode's pages without needing to create a list of them. This will allow O_DIRECT calls to be supported in future patches. Signed-off-by: David Howells <dhowells@redhat.com> Tested-By: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/152898380012.11616.12094591785228251717.stgit@warthog.procyon.org.uk/ Link: https://lore.kernel.org/r/153685394431.14766.3178466345696987059.stgit@warthog.procyon.org.uk/ Link: https://lore.kernel.org/r/153999787395.866.11218209749223643998.stgit@warthog.procyon.org.uk/ Link: https://lore.kernel.org/r/154033911195.12041.3882700371848894587.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/158861250059.340223.1248231474865140653.stgit@warthog.procyon.org.uk/ # rfc Link: https://lore.kernel.org/r/159465827399.1377938.11181327349704960046.stgit@warthog.procyon.org.uk/ Link: https://lore.kernel.org/r/160588533776.3465195.3612752083351956948.stgit@warthog.procyon.org.uk/ # rfc Link: https://lore.kernel.org/r/161118151238.1232039.17015723405750601161.stgit@warthog.procyon.org.uk/ # rfc Link: https://lore.kernel.org/r/161161047240.2537118.14721975104810564022.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/161340410333.1303470.16260122230371140878.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/161539554187.286939.15305559004905459852.stgit@warthog.procyon.org.uk/ # v4 Link: https://lore.kernel.org/r/161653810525.2770958.4630666029125411789.stgit@warthog.procyon.org.uk/ # v5 Link: https://lore.kernel.org/r/161789093719.6155.7877160739235087723.stgit@warthog.procyon.org.uk/ # v6
2021-04-23afs: Log remote unmarshalling errorsDavid Howells1-0/+34
Log unmarshalling errors reported by the peer (ie. it can't parse what we sent it). Limit the maximum number of messages to 3. Signed-off-by: David Howells <dhowells@redhat.com> Tested-By: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/159465826250.1377938.16372395422217583913.stgit@warthog.procyon.org.uk/ Link: https://lore.kernel.org/r/160588532584.3465195.15618385466614028590.stgit@warthog.procyon.org.uk/ # rfc Link: https://lore.kernel.org/r/161118149739.1232039.208060911149801695.stgit@warthog.procyon.org.uk/ # rfc Link: https://lore.kernel.org/r/161161046033.2537118.7779717661044373273.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/161340409118.1303470.17812607349396199116.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/161539552964.286939.16503232687974398308.stgit@warthog.procyon.org.uk/ # v4 Link: https://lore.kernel.org/r/161653808989.2770958.11530765353025697860.stgit@warthog.procyon.org.uk/ # v5 Link: https://lore.kernel.org/r/161789092349.6155.8581594259882708631.stgit@warthog.procyon.org.uk/ # v6
2021-04-23afs: Don't truncate iter during data fetchDavid Howells4-8/+23
Don't truncate the iterator to correspond to the actual data size when fetching the data from the server - rather, pass the length we want to read to rxrpc. This will allow the clear-after-read code in future to simply clear the remaining iterator capacity rather than having to reinitialise the iterator. Signed-off-by: David Howells <dhowells@redhat.com> Tested-By: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/158861249201.340223.13035445866976590375.stgit@warthog.procyon.org.uk/ # rfc Link: https://lore.kernel.org/r/159465825061.1377938.14403904452300909320.stgit@warthog.procyon.org.uk/ Link: https://lore.kernel.org/r/160588531418.3465195.10712005940763063144.stgit@warthog.procyon.org.uk/ # rfc Link: https://lore.kernel.org/r/161118148567.1232039.13380313332292947956.stgit@warthog.procyon.org.uk/ # rfc Link: https://lore.kernel.org/r/161161044610.2537118.17908520793806837792.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/161340407907.1303470.6501394859511712746.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/161539551721.286939.14655713136572200716.stgit@warthog.procyon.org.uk/ # v4 Link: https://lore.kernel.org/r/161653807790.2770958.14034599989374173734.stgit@warthog.procyon.org.uk/ # v5 Link: https://lore.kernel.org/r/161789090823.6155.15673999934535049102.stgit@warthog.procyon.org.uk/ # v6
2021-04-23afs: Move key to afs_read structDavid Howells4-15/+19
Stash the key used to authenticate read operations in the afs_read struct. This will be necessary to reissue the operation against the server if a read from the cache fails in upcoming cache changes. Signed-off-by: David Howells <dhowells@redhat.com> Tested-By: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/158861248336.340223.1851189950710196001.stgit@warthog.procyon.org.uk/ # rfc Link: https://lore.kernel.org/r/159465823899.1377938.11925978022348532049.stgit@warthog.procyon.org.uk/ Link: https://lore.kernel.org/r/160588529557.3465195.7303323479305254243.stgit@warthog.procyon.org.uk/ # rfc Link: https://lore.kernel.org/r/161118147693.1232039.13780672951838643842.stgit@warthog.procyon.org.uk/ # rfc Link: https://lore.kernel.org/r/161161043340.2537118.511899217704140722.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/161340406678.1303470.12676824086429446370.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/161539550819.286939.1268332875889175195.stgit@warthog.procyon.org.uk/ # v4 Link: https://lore.kernel.org/r/161653806683.2770958.11300984379283401542.stgit@warthog.procyon.org.uk/ # v5 Link: https://lore.kernel.org/r/161789089556.6155.14603302893431820997.stgit@warthog.procyon.org.uk/ # v6
2021-04-23afs: Print the operation debug_id when logging an unexpected data versionDavid Howells1-2/+3
Print the afs_operation debug_id when logging an unexpected change in the data version. This allows the logged message to be matched against tracelines. Signed-off-by: David Howells <dhowells@redhat.com> Tested-By: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/160588528377.3465195.2206051235095182302.stgit@warthog.procyon.org.uk/ # rfc Link: https://lore.kernel.org/r/161118146111.1232039.11398082422487058312.stgit@warthog.procyon.org.uk/ # rfc Link: https://lore.kernel.org/r/161161042180.2537118.2471333561661033316.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/161340405772.1303470.3877167548944248214.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/161539549628.286939.15234870409714613954.stgit@warthog.procyon.org.uk/ # v4 Link: https://lore.kernel.org/r/161653805530.2770958.15120507632529970934.stgit@warthog.procyon.org.uk/ # v5 Link: https://lore.kernel.org/r/161789088290.6155.3494369629853673866.stgit@warthog.procyon.org.uk/ # v6
2021-04-23afs: Pass page into dirty region helpers to provide THP sizeDavid Howells3-54/+42
Pass a pointer to the page being accessed into the dirty region helpers so that the size of the page can be determined in case it's a transparent huge page. This also required the page to be passed into the afs_page_dirty trace point - so there's no need to specifically pass in the index or private data as these can be retrieved directly from the page struct. Signed-off-by: David Howells <dhowells@redhat.com> Tested-By: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/160588527183.3465195.16107942526481976308.stgit@warthog.procyon.org.uk/ # rfc Link: https://lore.kernel.org/r/161118144921.1232039.11377711180492625929.stgit@warthog.procyon.org.uk/ # rfc Link: https://lore.kernel.org/r/161161040747.2537118.11435394902674511430.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/161340404553.1303470.11414163641767769882.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/161539548385.286939.8864598314493255313.stgit@warthog.procyon.org.uk/ # v4 Link: https://lore.kernel.org/r/161653804285.2770958.3497360004849598038.stgit@warthog.procyon.org.uk/ # v5 Link: https://lore.kernel.org/r/161789087043.6155.16922142208140170528.stgit@warthog.procyon.org.uk/ # v6
2021-04-23afs: Disable use of the fscache I/O routinesDavid Howells3-175/+36
Disable use of the fscache I/O routined by the AFS filesystem. It's about to transition to passing iov_iters down and fscache is about to have its I/O path to use iov_iter, so all that needs to change. Signed-off-by: David Howells <dhowells@redhat.com> Tested-By: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/158861209824.340223.1864211542341758994.stgit@warthog.procyon.org.uk/ # rfc Link: https://lore.kernel.org/r/159465768717.1376105.2229314852486665807.stgit@warthog.procyon.org.uk/ Link: https://lore.kernel.org/r/160588457929.3465195.1730097418904945578.stgit@warthog.procyon.org.uk/ # rfc Link: https://lore.kernel.org/r/161118143744.1232039.2727898205333669064.stgit@warthog.procyon.org.uk/ # rfc Link: https://lore.kernel.org/r/161161039077.2537118.7986870854927176905.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/161340403323.1303470.8159439948319423431.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/161539547167.286939.3536238932531122332.stgit@warthog.procyon.org.uk/ # v4 Link: https://lore.kernel.org/r/161653802797.2770958.547311814861545911.stgit@warthog.procyon.org.uk/ # v5 Link: https://lore.kernel.org/r/161789085806.6155.2596146255056027428.stgit@warthog.procyon.org.uk/ # v6
2021-03-23afs: Use wait_on_page_writeback_killableMatthew Wilcox (Oracle)1-2/+1
Open-coding this function meant it missed out on the recent bugfix for waiters being woken by a delayed wake event from a previous instantiation of the page[1]. [DH: Changed the patch to use vmf->page rather than variable page which doesn't exist yet upstream] Fixes: 1cf7a1518aef ("afs: Implement shared-writeable mmap") Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: kafs-testing@auristor.com cc: linux-afs@lists.infradead.org cc: linux-mm@kvack.org Link: https://lore.kernel.org/r/20210320054104.1300774-4-willy@infradead.org Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c2407cf7d22d0c0d94cf20342b3b8f06f1d904e7 [1]
2021-03-15afs: Stop listxattr() from listing "afs.*" attributesDavid Howells6-28/+0
afs_listxattr() lists all the available special afs xattrs (i.e. those in the "afs.*" space), no matter what type of server we're dealing with. But OpenAFS servers, for example, cannot deal with some of the extra-capable attributes that AuriStor (YFS) servers provide. Unfortunately, the presence of the afs.yfs.* attributes causes errors[1] for anything that tries to read them if the server is of the wrong type. Fix the problem by removing afs_listxattr() so that none of the special xattrs are listed (AFS doesn't support xattrs). It does mean, however, that getfattr won't list them, though they can still be accessed with getxattr() and setxattr(). This can be tested with something like: getfattr -d -m ".*" /afs/example.com/path/to/file With this change, none of the afs.* attributes should be visible. Changes: ver #2: - Hide all of the afs.* xattrs, not just the ACL ones. Fixes: ae46578b963f ("afs: Get YFS ACLs and information through xattrs") Reported-by: Gaja Sophie Peters <gaja.peters@math.uni-hamburg.de> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Gaja Sophie Peters <gaja.peters@math.uni-hamburg.de> Reviewed-by: Jeffrey Altman <jaltman@auristor.com> Reviewed-by: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Link: http://lists.infradead.org/pipermail/linux-afs/2021-March/003502.html [1] Link: http://lists.infradead.org/pipermail/linux-afs/2021-March/003567.html # v1 Link: http://lists.infradead.org/pipermail/linux-afs/2021-March/003573.html # v2
2021-03-15afs: Fix accessing YFS xattrs on a non-YFS serverDavid Howells2-3/+12
If someone attempts to access YFS-related xattrs (e.g. afs.yfs.acl) on a file on a non-YFS AFS server (such as OpenAFS), then the kernel will jump to a NULL function pointer because the afs_fetch_acl_operation descriptor doesn't point to a function for issuing an operation on a non-YFS server[1]. Fix this by making afs_wait_for_operation() check that the issue_afs_rpc method is set before jumping to it and setting -ENOTSUPP if not. This fix also covers other potential operations that also only exist on YFS servers. afs_xattr_get/set_yfs() then need to translate -ENOTSUPP to -ENODATA as the former error is internal to the kernel. The bug shows up as an oops like the following: BUG: kernel NULL pointer dereference, address: 0000000000000000 [...] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6. [...] Call Trace: afs_wait_for_operation+0x83/0x1b0 [kafs] afs_xattr_get_yfs+0xe6/0x270 [kafs] __vfs_getxattr+0x59/0x80 vfs_getxattr+0x11c/0x140 getxattr+0x181/0x250 ? __check_object_size+0x13f/0x150 ? __fput+0x16d/0x250 __x64_sys_fgetxattr+0x64/0xb0 do_syscall_64+0x49/0xc0 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7fb120a9defe This was triggered with "cp -a" which attempts to copy xattrs, including afs ones, but is easier to reproduce with getfattr, e.g.: getfattr -d -m ".*" /afs/openafs.org/ Fixes: e49c7b2f6de7 ("afs: Build an abstraction around an "operation" concept") Reported-by: Gaja Sophie Peters <gaja.peters@math.uni-hamburg.de> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Gaja Sophie Peters <gaja.peters@math.uni-hamburg.de> Reviewed-by: Marc Dionne <marc.dionne@auristor.com> Reviewed-by: Jeffrey Altman <jaltman@auristor.com> cc: linux-afs@lists.infradead.org Link: http://lists.infradead.org/pipermail/linux-afs/2021-March/003498.html [1] Link: http://lists.infradead.org/pipermail/linux-afs/2021-March/003566.html # v1 Link: http://lists.infradead.org/pipermail/linux-afs/2021-March/003572.html # v2