Age | Commit message (Collapse) | Author | Files | Lines |
|
Keep void * instead, allocate on demand (in parse_str_opts, at the
moment). Eventually both selinux and smack will be better off
with private structures with several strings in those, rather than
this "counter and two pointers to dynamically allocated arrays"
ugliness. This commit allows to do that at leisure, without
disrupting anything outside of given module.
Changes:
* instead of struct security_mnt_opt use an opaque pointer
initialized to NULL.
* security_sb_eat_lsm_opts(), security_sb_parse_opts_str() and
security_free_mnt_opts() take it as var argument (i.e. as void **);
call sites are unchanged.
* security_sb_set_mnt_opts() and security_sb_remount() take
it by value (i.e. as void *).
* new method: ->sb_free_mnt_opts(). Takes void *, does
whatever freeing that needs to be done.
* ->sb_set_mnt_opts() and ->sb_remount() might get NULL as
mnt_opts argument, meaning "empty".
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
* if mount(2) passes something like "context=foo" with MS_REMOUNT
in flags (/sbin/mount.nfs will _not_ do that - you need to issue
the syscall manually), you'll get leaked copies for LSM options.
The reason is that instead of nfs_{alloc,free}_parsed_mount_data()
nfs_remount() uses kzalloc/kfree, which lacks the needed cleanup.
* selinux options are not changed on remount (as for any other
fs), but in case of NFS the failure is quiet - they are not compared
to what we used to have, with complaint in case of attempted changes.
Trivially fixed by converting to use of security_sb_remount().
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
1) keeping a copy in btrfs_fs_info is completely pointless - we never
use it for anything. Getting rid of that allows for simpler calling
conventions for setup_security_options() (caller is responsible for
freeing mnt_opts in all cases).
2) on remount we want to use ->sb_remount(), not ->sb_set_mnt_opts(),
same as we would if not for FS_BINARY_MOUNTDATA. Behaviours *are*
close (in fact, selinux sb_set_mnt_opts() ought to punt to
sb_remount() in "already initialized" case), but let's handle
that uniformly. And the only reason why the original btrfs changes
didn't go for security_sb_remount() in btrfs_remount() case is that
it hadn't been exported. Let's export it for a while - it'll be
going away soon anyway.
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
... leaving the "is it kernel-internal" logics in the caller.
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
combination of alloc_secdata(), security_sb_copy_data(),
security_sb_parse_opt_str() and free_secdata().
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
This paves the way for retaining the LSM options from a common filesystem
mount context during a mount parameter parsing phase to be instituted prior
to actual mount/reconfiguration actions.
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
This paves the way for retaining the LSM options from a common filesystem
mount context during a mount parameter parsing phase to be instituted prior
to actual mount/reconfiguration actions.
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Separate just the changing of mount flags (MS_REMOUNT|MS_BIND) from full
remount because the mount data will get parsed with the new fs_context
stuff prior to doing a remount - and this causes the syscall to fail under
some circumstances.
To quote Eric's explanation:
[...] mount(..., MS_REMOUNT|MS_BIND, ...) now validates the mount options
string, which breaks systemd unit files with ProtectControlGroups=yes
(e.g. systemd-networkd.service) when systemd does the following to
change a cgroup (v1) mount to read-only:
mount(NULL, "/run/systemd/unit-root/sys/fs/cgroup/systemd", NULL,
MS_RDONLY|MS_NOSUID|MS_NODEV|MS_NOEXEC|MS_REMOUNT|MS_BIND, NULL)
... when the kernel has CONFIG_CGROUPS=y but no cgroup subsystems
enabled, since in that case the error "cgroup1: Need name or subsystem
set" is hit when the mount options string is empty.
Probably it doesn't make sense to validate the mount options string at
all in the MS_REMOUNT|MS_BIND case, though maybe you had something else
in mind.
This is also worthwhile doing because we will need to add a mount_setattr()
syscall to take over the remount-bind function.
Reported-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Reviewed-by: David Howells <dhowells@redhat.com>
|
|
Only the mount namespace code that implements mount(2) should be using the
MS_* flags. Suppress them inside the kernel unless uapi/linux/mount.h is
included.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Reviewed-by: David Howells <dhowells@redhat.com>
|
|
... and don't abuse mount_nodev(), while we are at it.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Reviewed-by: David Howells <dhowells@redhat.com>
|
|
Pull UBIFS updates from Richard Weinberger:
- Full filesystem authentication feature, UBIFS is now able to have the
whole filesystem structure authenticated plus user data encrypted and
authenticated.
- Minor cleanups
* tag 'tags/upstream-4.20-rc1' of git://git.infradead.org/linux-ubifs: (26 commits)
ubifs: Remove unneeded semicolon
Documentation: ubifs: Add authentication whitepaper
ubifs: Enable authentication support
ubifs: Do not update inode size in-place in authenticated mode
ubifs: Add hashes and HMACs to default filesystem
ubifs: authentication: Authenticate super block node
ubifs: Create hash for default LPT
ubfis: authentication: Authenticate master node
ubifs: authentication: Authenticate LPT
ubifs: Authenticate replayed journal
ubifs: Add auth nodes to garbage collector journal head
ubifs: Add authentication nodes to journal
ubifs: authentication: Add hashes to index nodes
ubifs: Add hashes to the tree node cache
ubifs: Create functions to embed a HMAC in a node
ubifs: Add helper functions for authentication support
ubifs: Add separate functions to init/crc a node
ubifs: Format changes for authentication support
ubifs: Store read superblock node
ubifs: Drop write_node
...
|
|
Pull NFS client bugfixes from Trond Myklebust:
"Highlights include:
Bugfix:
- Fix build issues on architectures that don't provide 64-bit cmpxchg
Cleanups:
- Fix a spelling mistake"
* tag 'nfs-for-4.20-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
NFS: fix spelling mistake, EACCESS -> EACCES
SUNRPC: Use atomic(64)_t for seq_send(64)
|
|
Pull cifs fixes and updates from Steve French:
"Three small fixes (one Kerberos related, one for stable, and another
fixes an oops in xfstest 377), two helpful debugging improvements,
three patches for cifs directio and some minor cleanup"
* tag '4.20-rc1-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: fix signed/unsigned mismatch on aio_read patch
cifs: don't dereference smb_file_target before null check
CIFS: Add direct I/O functions to file_operations
CIFS: Add support for direct I/O write
CIFS: Add support for direct I/O read
smb3: missing defines and structs for reparse point handling
smb3: allow more detailed protocol info on open files for debugging
smb3: on kerberos mount if server doesn't specify auth type use krb5
smb3: add trace point for tree connection
cifs: fix spelling mistake, EACCESS -> EACCES
cifs: fix return value for cifs_listxattr
|
|
syzbot is reporting too large memory allocation at bfs_fill_super() [1].
Since file system image is corrupted such that bfs_sb->s_start == 0,
bfs_fill_super() is trying to allocate 8MB of continuous memory. Fix
this by adding a sanity check on bfs_sb->s_start, __GFP_NOWARN and
printf().
[1] https://syzkaller.appspot.com/bug?id=16a87c236b951351374a84c8a32f40edbc034e96
Link: http://lkml.kernel.org/r/1525862104-3407-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reported-by: syzbot <syzbot+71c6b5d68e91149fc8a4@syzkaller.appspotmail.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Tigran Aivazian <aivazian.tigran@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
ocfs2_defrag_extent() might leak allocated clusters. When the file
system has insufficient space, the number of claimed clusters might be
less than the caller wants. If that happens, the original code might
directly commit the transaction without returning clusters.
This patch is based on code in ocfs2_add_clusters_in_btree().
[akpm@linux-foundation.org: include localalloc.h, reduce scope of data_ac]
Link: http://lkml.kernel.org/r/20180904041621.16874-3-lchen@suse.com
Signed-off-by: Larry Chen <lchen@suse.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: Changwei Ge <ge.changwei@h3c.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The handling of timestamps outside of the 1970..2038 range in the dlm
glue is rather inconsistent: on 32-bit architectures, this has always
wrapped around to negative timestamps in the 1902..1969 range, while on
64-bit kernels all timestamps are interpreted as positive 34 bit numbers
in the 1970..2514 year range.
Now that the VFS code handles 64-bit timestamps on all architectures, we
can make the behavior more consistent here, and return the same result
that we had on 64-bit already, making the file system y2038 safe in the
process. Outside of dlmglue, it already uses 64-bit on-disk timestamps
anway, so that part is fine.
For consistency, I'm changing ocfs2_pack_timespec() to clamp anything
outside of the supported range to the minimum and maximum values. This
avoids a possible ambiguity of values before 1970 in particular, which
used to be interpreted as times at the end of the 2514 range previously.
Link: http://lkml.kernel.org/r/20180619155826.4106487-1-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: Changwei Ge <ge.changwei@h3c.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
ocfs2_read_blocks() and ocfs2_read_blocks_sync() are both used to read
several blocks from disk. Currently, the input argument *bhs* can be
NULL or NOT. It depends on the caller's behavior. If the function
fails in reading blocks from disk, the corresponding bh will be assigned
to NULL and put.
Obviously, above process for non-NULL input bh is not appropriate.
Because the caller doesn't even know its bhs are put and re-assigned.
If buffer head is managed by caller, ocfs2_read_blocks and
ocfs2_read_blocks_sync() should not evaluate it to NULL. It will cause
caller accessing illegal memory, thus crash.
Link: http://lkml.kernel.org/r/HK2PR06MB045285E0F4FBB561F9F2F9B3D5680@HK2PR06MB0452.apcprd06.prod.outlook.com
Signed-off-by: Changwei Ge <ge.changwei@h3c.com>
Reviewed-by: Guozhonghua <guozhonghua@h3c.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: Changwei Ge <ge.changwei@h3c.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Somehow, file system metadata was corrupted, which causes
ocfs2_check_dir_entry() to fail in function ocfs2_dir_foreach_blk_el().
According to the original design intention, if above happens we should
skip the problematic block and continue to retrieve dir entry. But
there is obviouse misuse of brelse around related code.
After failure of ocfs2_check_dir_entry(), current code just moves to
next position and uses the problematic buffer head again and again
during which the problematic buffer head is released for multiple times.
I suppose, this a serious issue which is long-lived in ocfs2. This may
cause other file systems which is also used in a the same host insane.
So we should also consider about bakcporting this patch into linux
-stable.
Link: http://lkml.kernel.org/r/HK2PR06MB045211675B43EED794E597B6D56E0@HK2PR06MB0452.apcprd06.prod.outlook.com
Signed-off-by: Changwei Ge <ge.changwei@h3c.com>
Suggested-by: Changkuo Shi <shi.changkuo@h3c.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
When -EIOCBQUEUED returns, it means that aio_complete() will be called
from dio_complete(), which is an asynchronous progress against
write_iter. Generally, IO is a very slow progress than executing
instruction, but we still can't take the risk to access a freed iocb.
And we do face a BUG crash issue. Using the crash tool, iocb is
obviously freed already.
crash> struct -x kiocb ffff881a350f5900
struct kiocb {
ki_filp = 0xffff881a350f5a80,
ki_pos = 0x0,
ki_complete = 0x0,
private = 0x0,
ki_flags = 0x0
}
And the backtrace shows:
ocfs2_file_write_iter+0xcaa/0xd00 [ocfs2]
aio_run_iocb+0x229/0x2f0
do_io_submit+0x291/0x540
SyS_io_submit+0x10/0x20
system_call_fastpath+0x16/0x75
Link: http://lkml.kernel.org/r/1523361653-14439-1-git-send-email-ge.changwei@h3c.com
Signed-off-by: Changwei Ge <ge.changwei@h3c.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
During one dead node's recovery by other node, quota recovery work will
be queued. We should avoid calling quota when it is not supported, so
check the quota flags.
Link: http://lkml.kernel.org/r/71604351584F6A4EBAE558C676F37CA401071AC9FB@H3CMLB12-EX.srv.huawei-3com.com
Signed-off-by: guozhonghua <guozhonghua@h3c.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: Changwei Ge <ge.changwei@h3c.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Remove ocfs2_is_o2cb_active(). We have similar functions to identify
which cluster stack is being used via osb->osb_cluster_stack.
Secondly, the current implementation of ocfs2_is_o2cb_active() is not
totally safe. Based on the design of stackglue, we need to get
ocfs2_stack_lock before using ocfs2_stack related data structures, and
that active_stack pointer can be NULL in the case of mount failure.
Link: http://lkml.kernel.org/r/1495441079-11708-1-git-send-email-ghe@suse.com
Signed-off-by: Gang He <ghe@suse.com>
Reviewed-by: Joseph Qi <jiangqi903@gmail.com>
Reviewed-by: Eric Ren <zren@suse.com>
Acked-by: Changwei Ge <ge.changwei@h3c.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The patch "CIFS: Add support for direct I/O read" had
a signed/unsigned mismatch (ssize_t vs. size_t) in the
return from one function. Similar trivial change
in aio_write
Signed-off-by: Long Li <longli@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reported-by: Julia Lawall <julia.lawall@lip6.fr>
|
|
There is a null check on dst_file->private data which suggests
it can be potentially null. However, before this check, pointer
smb_file_target is derived from dst_file->private and dereferenced
in the call to tlink_tcon, hence there is a potential null pointer
deference.
Fix this by assigning smb_file_target and target_tcon after the
null pointer sanity checks.
Detected by CoverityScan, CID#1475302 ("Dereference before null check")
Fixes: 04b38d601239 ("vfs: pull btrfs clone API to vfs layer")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
With direct read/write functions implemented, add them to file_operations.
Dircet I/O is used under two conditions:
1. When mounting with "cache=none", CIFS uses direct I/O for all user file
data transfer.
2. When opening a file with O_DIRECT, CIFS uses direct I/O for all data
transfer on this file.
Signed-off-by: Long Li <longli@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
|
|
With direct I/O write, user supplied buffers are pinned to the memory and data
are transferred directly from user buffers to the transport layer.
Change in v3: add support for kernel AIO
Change in v4:
Refactor common write code to __cifs_writev for direct and non-direct I/O.
Retry on direct I/O failure.
Signed-off-by: Long Li <longli@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
With direct I/O read, we transfer the data directly from transport layer to
the user data buffer.
Change in v3: add support for kernel AIO
Change in v4:
Refactor common read code to __cifs_readv for direct and non-direct I/O.
Retry on direct I/O failure.
Signed-off-by: Long Li <longli@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
We were missing some structs from MS-FSCC relating to
reparse point handling. Add them to protocol defines
in smb2pdu.h
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
|
|
In order to debug complex problems it is often helpful to
have detailed information on the client and server view
of the open file information. Add the ability for root to
view the list of smb3 open files and dump the persistent
handle and other info so that it can be more easily
correlated with server logs.
Sample output from "cat /proc/fs/cifs/open_files"
# Version:1
# Format:
# <tree id> <persistent fid> <flags> <count> <pid> <uid> <filename> <mid>
0x5 0x800000378 0x8000 1 7704 0 some-file 0x14
0xcb903c0c 0x84412e67 0x8000 1 7754 1001 rofile 0x1a6d
0xcb903c0c 0x9526b767 0x8000 1 7720 1000 file 0x1a5b
0xcb903c0c 0x9ce41a21 0x8000 1 7715 0 smallfile 0xd67
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
|
|
Some servers (e.g. Azure) do not include a spnego blob in the SMB3
negotiate protocol response, so on kerberos mounts ("sec=krb5")
we can fail, as we expected the server to list its supported
auth types (OIDs in the spnego blob in the negprot response).
Change this so that on krb5 mounts we default to trying krb5 if the
server doesn't list its supported protocol mechanisms.
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
CC: Stable <stable@vger.kernel.org>
|
|
In debugging certain scenarios, especially reconnect cases,
it can be helpful to have a dynamic trace point for the
result of tree connect. See sample output below
from a reconnect event. The new event is 'smb3_tcon'
TASK-PID CPU# |||| TIMESTAMP FUNCTION
| | | |||| | |
cifsd-6071 [001] .... 2659.897923: smb3_reconnect: server=localhost current_mid=0xa
kworker/1:1-71 [001] .... 2666.026342: smb3_cmd_done: sid=0x0 tid=0x0 cmd=0 mid=0
kworker/1:1-71 [001] .... 2666.026576: smb3_cmd_err: sid=0xc49e1787 tid=0x0 cmd=1 mid=1 status=0xc0000016 rc=-5
kworker/1:1-71 [001] .... 2666.031677: smb3_cmd_done: sid=0xc49e1787 tid=0x0 cmd=1 mid=2
kworker/1:1-71 [001] .... 2666.031921: smb3_cmd_done: sid=0xc49e1787 tid=0x6e78f05f cmd=3 mid=3
kworker/1:1-71 [001] .... 2666.031923: smb3_tcon: xid=0 sid=0xc49e1787 tid=0x0 unc_name=\\localhost\test rc=0
kworker/1:1-71 [001] .... 2666.032097: smb3_cmd_done: sid=0xc49e1787 tid=0x6e78f05f cmd=11 mid=4
kworker/1:1-71 [001] .... 2666.032265: smb3_cmd_done: sid=0xc49e1787 tid=0x7912332f cmd=3 mid=5
kworker/1:1-71 [001] .... 2666.032266: smb3_tcon: xid=0 sid=0xc49e1787 tid=0x0 unc_name=\\localhost\IPC$ rc=0
kworker/1:1-71 [001] .... 2666.032386: smb3_cmd_done: sid=0xc49e1787 tid=0x7912332f cmd=11 mid=6
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
|
|
Trivial fix to a spelling mistake of the error access name EACCESS,
rename to EACCES
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
If the application buffer was too small to fit all the names
we would still count the number of bytes and return this for
listxattr. This would then trigger a BUG in usercopy.c
Fix the computation of the size so that we return -ERANGE
correctly when the buffer is too small.
This fixes the kernel BUG for xfstest generic/377
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
|
|
Pull block layer fixes from Jens Axboe:
"The biggest part of this pull request is the revert of the blkcg
cleanup series. It had one fix earlier for a stacked device issue, but
another one was reported. Rather than play whack-a-mole with this,
revert the entire series and try again for the next kernel release.
Apart from that, only small fixes/changes.
Summary:
- Indentation fixup for mtip32xx (Colin Ian King)
- The blkcg cleanup series revert (Dennis Zhou)
- Two NVMe fixes. One fixing a regression in the nvme request
initialization in this merge window, causing nvme-fc to not work.
The other is a suspend/resume p2p resource issue (James, Keith)
- Fix sg discard merge, allowing us to merge in cases where we didn't
before (Jianchao Wang)
- Call rq_qos_exit() after the queue is frozen, preventing a hang
(Ming)
- Fix brd queue setup, fixing an oops if we fail setting up all
devices (Ming)"
* tag 'for-linus-20181102' of git://git.kernel.dk/linux-block:
nvme-pci: fix conflicting p2p resource adds
nvme-fc: fix request private initialization
blkcg: revert blkcg cleanups series
block: brd: associate with queue until adding disk
block: call rq_qos_exit() after queue is frozen
mtip32xx: clean an indentation issue, remove extraneous tabs
block: fix the DISCARD request merge
|
|
Pull vfs dedup fixes from Dave Chinner:
"This reworks the vfs data cloning infrastructure.
We discovered many issues with these interfaces late in the 4.19 cycle
- the worst of them (data corruption, setuid stripping) were fixed for
XFS in 4.19-rc8, but a larger rework of the infrastructure fixing all
the problems was needed. That rework is the contents of this pull
request.
Rework the vfs_clone_file_range and vfs_dedupe_file_range
infrastructure to use a common .remap_file_range method and supply
generic bounds and sanity checking functions that are shared with the
data write path. The current VFS infrastructure has problems with
rlimit, LFS file sizes, file time stamps, maximum filesystem file
sizes, stripping setuid bits, etc and so they are addressed in these
commits.
We also introduce the ability for the ->remap_file_range methods to
return short clones so that clones for vfs_copy_file_range() don't get
rejected if the entire range can't be cloned. It also allows
filesystems to sliently skip deduplication of partial EOF blocks if
they are not capable of doing so without requiring errors to be thrown
to userspace.
Existing filesystems are converted to user the new remap_file_range
method, and both XFS and ocfs2 are modified to make use of the new
generic checking infrastructure"
* tag 'xfs-4.20-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (28 commits)
xfs: remove [cm]time update from reflink calls
xfs: remove xfs_reflink_remap_range
xfs: remove redundant remap partial EOF block checks
xfs: support returning partial reflink results
xfs: clean up xfs_reflink_remap_blocks call site
xfs: fix pagecache truncation prior to reflink
ocfs2: remove ocfs2_reflink_remap_range
ocfs2: support partial clone range and dedupe range
ocfs2: fix pagecache truncation prior to reflink
ocfs2: truncate page cache for clone destination file before remapping
vfs: clean up generic_remap_file_range_prep return value
vfs: hide file range comparison function
vfs: enable remap callers that can handle short operations
vfs: plumb remap flags through the vfs dedupe functions
vfs: plumb remap flags through the vfs clone functions
vfs: make remap_file_range functions take and return bytes completed
vfs: remap helper should update destination inode metadata
vfs: pass remap flags to generic_remap_checks
vfs: pass remap flags to generic_remap_file_range_prep
vfs: combine the clone and dedupe into a single remap_file_range
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull misc vfs updates from Al Viro:
"No common topic, really - a handful of assorted stuff; the least
trivial bits are Mark's dedupe patches"
* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fs/exofs: only use true/false for asignment of bool type variable
fs/exofs: fix potential memory leak in mount option parsing
Delete invalid assignment statements in do_sendfile
iomap: remove duplicated include from iomap.c
vfs: dedupe should return EPERM if permission is not granted
vfs: allow dedupe of user owned read-only files
ntfs: don't open-code ERR_CAST
ext4: don't open-code ERR_CAST
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull AFS updates from Al Viro:
"AFS series, with some iov_iter bits included"
* 'work.afs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (26 commits)
missing bits of "iov_iter: Separate type from direction and use accessor functions"
afs: Probe multiple fileservers simultaneously
afs: Fix callback handling
afs: Eliminate the address pointer from the address list cursor
afs: Allow dumping of server cursor on operation failure
afs: Implement YFS support in the fs client
afs: Expand data structure fields to support YFS
afs: Get the target vnode in afs_rmdir() and get a callback on it
afs: Calc callback expiry in op reply delivery
afs: Fix FS.FetchStatus delivery from updating wrong vnode
afs: Implement the YFS cache manager service
afs: Remove callback details from afs_callback_break struct
afs: Commit the status on a new file/dir/symlink
afs: Increase to 64-bit volume ID and 96-bit vnode ID for YFS
afs: Don't invoke the server to read data beyond EOF
afs: Add a couple of tracepoints to log I/O errors
afs: Handle EIO from delivery function
afs: Fix TTL on VL server and address lists
afs: Implement VL server rotation
afs: Improve FS server rotation error handling
...
|
|
This reverts a series committed earlier due to null pointer exception
bug report in [1]. It seems there are edge case interactions that I did
not consider and will need some time to understand what causes the
adverse interactions.
The original series can be found in [2] with a follow up series in [3].
[1] https://www.spinics.net/lists/cgroups/msg20719.html
[2] https://lore.kernel.org/lkml/20180911184137.35897-1-dennisszhou@gmail.com/
[3] https://lore.kernel.org/lkml/20181020185612.51587-1-dennis@kernel.org/
This reverts the following commits:
d459d853c2ed, b2c3fa546705, 101246ec02b5, b3b9f24f5fcc, e2b0989954ae,
f0fcb3ec89f3, c839e7a03f92, bdc2491708c4, 74b7c02a9bc1, 5bf9a1f3b4ef,
a7b39b4e961c, 07b05bcc3213, 49f4c2dc2b50, 27e6fa996c53
Signed-off-by: Dennis Zhou <dennis@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
https://github.com/ojeda/linux
Pull compiler attribute updates from Miguel Ojeda:
"This is an effort to disentangle the include/linux/compiler*.h headers
and bring them up to date.
The main idea behind the series is to use feature checking macros
(i.e. __has_attribute) instead of compiler version checks (e.g.
GCC_VERSION), which are compiler-agnostic (so they can be shared,
reducing the size of compiler-specific headers) and version-agnostic.
Other related improvements have been performed in the headers as well,
which on top of the use of __has_attribute it has amounted to a
significant simplification of these headers (e.g. GCC_VERSION is now
only guarding a few non-attribute macros).
This series should also help the efforts to support compiling the
kernel with clang and icc. A fair amount of documentation and comments
have also been added, clarified or removed; and the headers are now
more readable, which should help kernel developers in general.
The series was triggered due to the move to gcc >= 4.6. In turn, this
series has also triggered Sparse to gain the ability to recognize
__has_attribute on its own.
Finally, the __nonstring variable attribute series has been also
applied on top; plus two related patches from Nick Desaulniers for
unreachable() that came a bit afterwards"
* tag 'compiler-attributes-for-linus-4.20-rc1' of https://github.com/ojeda/linux:
compiler-gcc: remove comment about gcc 4.5 from unreachable()
compiler.h: update definition of unreachable()
Compiler Attributes: ext4: remove local __nonstring definition
Compiler Attributes: auxdisplay: panel: use __nonstring
Compiler Attributes: enable -Wstringop-truncation on W=1 (gcc >= 8)
Compiler Attributes: add support for __nonstring (gcc >= 8)
Compiler Attributes: add MAINTAINERS entry
Compiler Attributes: add Doc/process/programming-language.rst
Compiler Attributes: remove uses of __attribute__ from compiler.h
Compiler Attributes: KENTRY used twice the "used" attribute
Compiler Attributes: use feature checks instead of version checks
Compiler Attributes: add missing SPDX ID in compiler_types.h
Compiler Attributes: remove unneeded sparse (__CHECKER__) tests
Compiler Attributes: homogenize __must_be_array
Compiler Attributes: remove unneeded tests
Compiler Attributes: always use the extra-underscores syntax
Compiler Attributes: remove unused attributes
|
|
backmerge to do fixup of iov_iter_kvec() conflict
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs
Pull overlayfs updates from Miklos Szeredi:
"A mix of fixes and cleanups"
* tag 'ovl-update-4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
ovl: automatically enable redirect_dir on metacopy=on
ovl: check whiteout in ovl_create_over_whiteout()
ovl: using posix_acl_xattr_size() to get size instead of posix_acl_to_xattr()
ovl: abstract ovl_inode lock with a helper
ovl: remove the 'locked' argument of ovl_nlink_{start,end}
ovl: relax requirement for non null uuid of lower fs
ovl: fold copy-up helpers into callers
ovl: untangle copy up call chain
ovl: relax permission checking on underlying layers
ovl: fix recursive oi->lock in ovl_link()
vfs: fix FIGETBSZ ioctl on an overlayfs file
ovl: clean up error handling in ovl_get_tmpfile()
ovl: fix error handling in ovl_verify_set_fh()
|
|
Current behavior is to automatically disable metacopy if redirect_dir is
not enabled and proceed with the mount.
If "metacopy=on" mount option was given, then this behavior can confuse the
user: no mount failure, yet metacopy is disabled.
This patch makes metacopy=on imply redirect_dir=on.
The converse is also true: turning off full redirect with redirect_dir=
{off|follow|nofollow} will disable metacopy.
If both metacopy=on and redirect_dir={off|follow|nofollow} is specified,
then mount will fail, since there's no way to correctly resolve the
conflict.
Reported-by: Daniel Walsh <dwalsh@redhat.com>
Fixes: d5791044d2e5 ("ovl: Provide a mount option metacopy=on/off...")
Cc: <stable@vger.kernel.org> # v4.19
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull stackleak gcc plugin from Kees Cook:
"Please pull this new GCC plugin, stackleak, for v4.20-rc1. This plugin
was ported from grsecurity by Alexander Popov. It provides efficient
stack content poisoning at syscall exit. This creates a defense
against at least two classes of flaws:
- Uninitialized stack usage. (We continue to work on improving the
compiler to do this in other ways: e.g. unconditional zero init was
proposed to GCC and Clang, and more plugin work has started too).
- Stack content exposure. By greatly reducing the lifetime of valid
stack contents, exposures via either direct read bugs or unknown
cache side-channels become much more difficult to exploit. This
complements the existing buddy and heap poisoning options, but
provides the coverage for stacks.
The x86 hooks are included in this series (which have been reviewed by
Ingo, Dave Hansen, and Thomas Gleixner). The arm64 hooks have already
been merged through the arm64 tree (written by Laura Abbott and
reviewed by Mark Rutland and Will Deacon).
With VLAs having been removed this release, there is no need for
alloca() protection, so it has been removed from the plugin"
* tag 'stackleak-v4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
arm64: Drop unneeded stackleak_check_alloca()
stackleak: Allow runtime disabling of kernel stack erasing
doc: self-protection: Add information about STACKLEAK feature
fs/proc: Show STACKLEAK metrics in the /proc file system
lkdtm: Add a test for STACKLEAK
gcc-plugins: Add STACKLEAK plugin for tracking the kernel stack
x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls
|
|
Trivial fix to a spelling mistake of the error access name EACCESS,
rename to EACCES
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
Pull fuse updates from Miklos Szeredi:
"As well as the usual bug fixes, this adds the following new features:
- cached readdir and readlink
- max I/O size increased from 128k to 1M
- improved performance and scalability of request queues
- copy_file_range support
The only non-fuse bits are trivial cleanups of macros in
<linux/bitops.h>"
* tag 'fuse-update-4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: (31 commits)
fuse: enable caching of symlinks
fuse: only invalidate atime in direct read
fuse: don't need GETATTR after every READ
fuse: allow fine grained attr cache invaldation
bitops: protect variables in bit_clear_unless() macro
bitops: protect variables in set_mask_bits() macro
fuse: realloc page array
fuse: add max_pages to init_out
fuse: allocate page array more efficiently
fuse: reduce size of struct fuse_inode
fuse: use iversion for readdir cache verification
fuse: use mtime for readdir cache verification
fuse: add readdir cache version
fuse: allow using readdir cache
fuse: allow caching readdir
fuse: extract fuse_emit() helper
fuse: add FOPEN_CACHE_DIR
fuse: split out readdir.c
fuse: Use hash table to link processing request
fuse: kill req->intr_unique
...
|
|
Pull ceph updates from Ilya Dryomov:
"The highlights are:
- a series that fixes some old memory allocation issues in libceph
(myself). We no longer allocate memory in places where allocation
failures cannot be handled and BUG when the allocation fails.
- support for copy_file_range() syscall (Luis Henriques). If size and
alignment conditions are met, it leverages RADOS copy-from
operation. Otherwise, a local copy is performed.
- a patch that reduces memory requirement of ceph_sync_read() from
the size of the entire read to the size of one object (Zheng Yan).
- fallocate() syscall is now restricted to FALLOC_FL_PUNCH_HOLE (Luis
Henriques)"
* tag 'ceph-for-4.20-rc1' of git://github.com/ceph/ceph-client: (25 commits)
ceph: new mount option to disable usage of copy-from op
ceph: support copy_file_range file operation
libceph: support the RADOS copy-from operation
ceph: add non-blocking parameter to ceph_try_get_caps()
libceph: check reply num_data_items in setup_request_data()
libceph: preallocate message data items
libceph, rbd, ceph: move ceph_osdc_alloc_messages() calls
libceph: introduce alloc_watch_request()
libceph: assign cookies in linger_submit()
libceph: enable fallback to ceph_msg_new() in ceph_msgpool_get()
ceph: num_ops is off by one in ceph_aio_retry_work()
libceph: no need to call osd_req_opcode_valid() in osd_req_encode_op()
ceph: set timeout conditionally in __cap_delay_requeue
libceph: don't consume a ref on pagelist in ceph_msg_data_add_pagelist()
libceph: introduce ceph_pagelist_alloc()
libceph: osd_req_op_cls_init() doesn't need to take opcode
libceph: bump CEPH_MSG_MAX_DATA_LEN
ceph: only allow punch hole mode in fallocate
ceph: refactor ceph_sync_read()
ceph: check if LOOKUPNAME request was aborted when filling trace
...
|
|
Merge more updates from Andrew Morton:
- the rest of MM
- lib/bitmap updates
- hfs updates
- fatfs updates
- various other misc things
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (94 commits)
mm/gup.c: fix __get_user_pages_fast() comment
mm: Fix warning in insert_pfn()
memory-hotplug.rst: add some details about locking internals
powerpc/powernv: hold device_hotplug_lock when calling memtrace_offline_pages()
powerpc/powernv: hold device_hotplug_lock when calling device_online()
mm/memory_hotplug: fix online/offline_pages called w.o. mem_hotplug_lock
mm/memory_hotplug: make add_memory() take the device_hotplug_lock
mm/memory_hotplug: make remove_memory() take the device_hotplug_lock
mm/memblock.c: warn if zero alignment was requested
memblock: stop using implicit alignment to SMP_CACHE_BYTES
docs/boot-time-mm: remove bootmem documentation
mm: remove include/linux/bootmem.h
memblock: replace BOOTMEM_ALLOC_* with MEMBLOCK variants
mm: remove nobootmem
memblock: rename __free_pages_bootmem to memblock_free_pages
memblock: rename free_all_bootmem to memblock_free_all
memblock: replace free_bootmem_late with memblock_free_late
memblock: replace free_bootmem{_node} with memblock_free
mm: nobootmem: remove bootmem allocation APIs
memblock: replace alloc_bootmem with memblock_alloc
...
|
|
Move remaining definitions and declarations from include/linux/bootmem.h
into include/linux/memblock.h and remove the redundant header.
The includes were replaced with the semantic patch below and then
semi-automated removal of duplicated '#include <linux/memblock.h>
@@
@@
- #include <linux/bootmem.h>
+ #include <linux/memblock.h>
[sfr@canb.auug.org.au: dma-direct: fix up for the removal of linux/bootmem.h]
Link: http://lkml.kernel.org/r/20181002185342.133d1680@canb.auug.org.au
[sfr@canb.auug.org.au: powerpc: fix up for removal of linux/bootmem.h]
Link: http://lkml.kernel.org/r/20181005161406.73ef8727@canb.auug.org.au
[sfr@canb.auug.org.au: x86/kaslr, ACPI/NUMA: fix for linux/bootmem.h removal]
Link: http://lkml.kernel.org/r/20181008190341.5e396491@canb.auug.org.au
Link: http://lkml.kernel.org/r/1536927045-23536-30-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Serge Semin <fancer.lancer@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
All architecures use memblock for early memory management. There is no need
for the CONFIG_HAVE_MEMBLOCK configuration option.
[rppt@linux.vnet.ibm.com: of/fdt: fixup #ifdefs]
Link: http://lkml.kernel.org/r/20180919103457.GA20545@rapoport-lnx
[rppt@linux.vnet.ibm.com: csky: fixups after bootmem removal]
Link: http://lkml.kernel.org/r/20180926112744.GC4628@rapoport-lnx
[rppt@linux.vnet.ibm.com: remove stale #else and the code it protects]
Link: http://lkml.kernel.org/r/1538067825-24835-1-git-send-email-rppt@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/1536927045-23536-4-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Tested-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Serge Semin <fancer.lancer@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
setattr_copy can't truncate timestamps correctly for
msdos/vfat, so truncate and copy them ourselves.
Link: http://lkml.kernel.org/r/a2b4701b1125573fafaeaae6802050ca86d6f8cc.1538363961.git.sorenson@redhat.com
Signed-off-by: Frank Sorenson <sorenson@redhat.com>
Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Convert the inode timestamp updates to use fat_truncate_time.
Link: http://lkml.kernel.org/r/2663d3083c4dd62f00b64612c8eaf5542bb05a4c.1538363961.git.sorenson@redhat.com
Signed-off-by: Frank Sorenson <sorenson@redhat.com>
Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|