Age | Commit message (Collapse) | Author | Files | Lines |
|
Fix another "confused deputy" weakness[1]. Writes to /proc/$pid/attr/
files need to check the opener credentials, since these fds do not
transition state across execve(). Without this, it is possible to
trick another process (which may have different credentials) to write
to its own /proc/$pid/attr/ files, leading to unexpected and possibly
exploitable behaviors.
[1] https://www.kernel.org/doc/html/latest/security/credentials.html?highlight=confused#open-file-credentials
Fixes: 1da177e4c3f41 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
Pull netfs fixes from David Howells:
"A couple of fixes to the new netfs lib:
- Pass the AOP flags through from netfs_write_begin() into
grab_cache_page_write_begin().
- Automatically enable in Kconfig netfs lib rather than presenting an
option for manual enablement"
* tag 'netfs-lib-fixes-20200525' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
netfs: Make CONFIG_NETFS_SUPPORT auto-selected rather than manual
netfs: Pass flags through to grab_cache_page_write_begin()
|
|
In preparation to enable -Wimplicit-fallthrough for Clang, fix multiple
warnings by explicitly adding multiple fallthrough pseudo-keywords in
places where the code is intended to fall through to the next case.
Link: https://github.com/KSPP/linux/issues/115
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeffrey Altman <jaltman@auristor.com>
cc: linux-afs@lists.infradead.org
cc: linux-hardening@vger.kernel.org
Link: https://lore.kernel.org/r/51150b54e0b0431a2c401cd54f2c4e7f50e94601.1605896059.git.gustavoars@kernel.org/ # v1
Link: https://lore.kernel.org/r/20210420211615.GA51432@embeddedor/ # v2
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Make the netfs helper library selected automatically by the things that use
it rather than being manually configured, even though it's required[1].
Fixes: 3a5829fefd3b ("netfs: Make a netfs helper module")
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
cc: linux-mm@kvack.org
cc: linux-cachefs@redhat.com
cc: linux-afs@lists.infradead.org
cc: linux-nfs@vger.kernel.org
cc: linux-cifs@vger.kernel.org
cc: ceph-devel@vger.kernel.org
cc: v9fs-developer@lists.sourceforge.net
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/CAMuHMdXJZ7iNQE964CdBOU=vRKVMFzo=YF_eiwsGgqzuvZ+TuA@mail.gmail.com [1]
Link: https://lore.kernel.org/r/162090298141.3166007.2971118149366779916.stgit@warthog.procyon.org.uk # v1
|
|
In netfs_write_begin(), pass the AOP flags through to
grab_cache_page_write_begin() so that a request to use GFP_NOFS is
honoured.
Fixes: e1b1240c1ff5 ("netfs: Add write_begin helper")
Reported-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
cc: linux-mm@kvack.org
cc: linux-cachefs@redhat.com
cc: linux-afs@lists.infradead.org
cc: linux-nfs@vger.kernel.org
cc: linux-cifs@vger.kernel.org
cc: ceph-devel@vger.kernel.org
cc: v9fs-developer@lists.sourceforge.net
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/162090295383.3165945.13595101698295243662.stgit@warthog.procyon.org.uk # v1
|
|
In commit d6995da31122 ("hugetlb: use page.private for hugetlb specific
page flags") the use of PagePrivate to indicate a reservation count
should be restored at free time was changed to the hugetlb specific flag
HPageRestoreReserve. Changes to a userfaultfd error path as well as a
VM_BUG_ON() in remove_inode_hugepages() were overlooked.
Users could see incorrect hugetlb reserve counts if they experience an
error with a UFFDIO_COPY operation. Specifically, this would be the
result of an unlikely copy_huge_page_from_user error. There is not an
increased chance of hitting the VM_BUG_ON.
Link: https://lkml.kernel.org/r/20210521233952.236434-1-mike.kravetz@oracle.com
Fixes: d6995da31122 ("hugetlb: use page.private for hugetlb specific page flags")
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Mina Almasry <almasry.mina@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Mina Almasry <almasrymina@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Pull block fixes from Jens Axboe:
- Fix BLKRRPART and deletion race (Gulam, Christoph)
- NVMe pull request (Christoph):
- nvme-tcp corruption and timeout fixes (Sagi Grimberg, Keith
Busch)
- nvme-fc teardown fix (James Smart)
- nvmet/nvme-loop memory leak fixes (Wu Bo)"
* tag 'block-5.13-2021-05-22' of git://git.kernel.dk/linux-block:
block: fix a race between del_gendisk and BLKRRPART
block: prevent block device lookups at the beginning of del_gendisk
nvme-fc: clear q_live at beginning of association teardown
nvme-tcp: rerun io_work if req_list is not empty
nvme-tcp: fix possible use-after-completion
nvme-loop: fix memory leak in nvme_loop_create_ctrl()
nvmet: fix memory leak in nvmet_alloc_ctrl()
|
|
Pull io_uring fixes from Jens Axboe:
"One fix for a regression with poll in this merge window, and another
just hardens the io-wq exit path a bit"
* tag 'io_uring-5.13-2021-05-22' of git://git.kernel.dk/linux-block:
io_uring: fortify tctx/io_wq cleanup
io_uring: don't modify req->poll for rw
|
|
Pull xfs fixes from Darrick Wong:
- Fix some math errors in the realtime allocator when extent size hints
are applied.
- Fix unnecessary short writes to realtime files when free space is
fragmented.
- Fix a crash when using scrub tracepoints.
- Restore ioctl uapi definitions that were accidentally removed in
5.13-rc1.
* tag 'xfs-5.13-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: restore old ioctl definitions
xfs: fix deadlock retry tracepoint arguments
xfs: retry allocations when locality-based search fails
xfs: adjust rt allocation minlen when extszhint > rtextsize
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
"A few more fixes:
- fix unaligned compressed writes in zoned mode
- fix false positive lockdep warning when cloning inline extent
- remove wrong BUG_ON in tree-log error handling"
* tag 'for-5.13-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: zoned: fix parallel compressed writes
btrfs: zoned: pass start block to btrfs_use_zone_append
btrfs: do not BUG_ON in link_to_fixup_dir
btrfs: release path before starting transaction when cloning inline extent
|
|
Pull cifs fixes from Steve French:
"Seven smb3 fixes: one for stable, three others fix problems found in
testing handle leases, and a compounded request fix"
* tag '5.13-rc3-smb3' of git://git.samba.org/sfrench/cifs-2.6:
Fix KASAN identified use-after-free issue.
Defer close only when lease is enabled.
Fix kernel oops when CONFIG_DEBUG_ATOMIC_SLEEP is enabled.
cifs: Fix inconsistent indenting
cifs: fix memory leak in smb2_copychunk_range
SMB3: incorrect file id in requests compounded with open
cifs: remove deadstore in cifs_close_all_deferred_files()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull siginfo fix from Eric Biederman:
"During the merge window an issue with si_perf and the siginfo ABI came
up. The alpha and sparc siginfo structure layout had changed with the
addition of SIGTRAP TRAP_PERF and the new field si_perf.
The reason only alpha and sparc were affected is that they are the
only architectures that use si_trapno.
Looking deeper it was discovered that si_trapno is used for only a few
select signals on alpha and sparc, and that none of the other
_sigfault fields past si_addr are used at all. Which means technically
no regression on alpha and sparc.
While the alignment concerns might be dismissed the abuse of si_errno
by SIGTRAP TRAP_PERF does have the potential to cause regressions in
existing userspace.
While we still have time before userspace starts using and depending
on the new definition siginfo for SIGTRAP TRAP_PERF this set of
changes cleans up siginfo_t.
- The si_trapno field is demoted from magic alpha and sparc status
and made an ordinary union member of the _sigfault member of
siginfo_t. Without moving it of course.
- si_perf is replaced with si_perf_data and si_perf_type ending the
abuse of si_errno.
- Unnecessary additions to signalfd_siginfo are removed"
* 'for-v5.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
signalfd: Remove SIL_PERF_EVENT fields from signalfd_siginfo
signal: Deliver all of the siginfo perf data in _perf
signal: Factor force_sig_perf out of perf_sigtrap
signal: Implement SIL_FAULT_TRAPNO
siginfo: Move si_trapno inside the union inside _si_fault
|
|
[ 612.157429] ==================================================================
[ 612.158275] BUG: KASAN: use-after-free in process_one_work+0x90/0x9b0
[ 612.158801] Read of size 8 at addr ffff88810a31ca60 by task kworker/2:9/2382
[ 612.159611] CPU: 2 PID: 2382 Comm: kworker/2:9 Tainted: G
OE 5.13.0-rc2+ #98
[ 612.159623] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS 1.14.0-1.fc33 04/01/2014
[ 612.159640] Workqueue: 0x0 (deferredclose)
[ 612.159669] Call Trace:
[ 612.159685] dump_stack+0xbb/0x107
[ 612.159711] print_address_description.constprop.0+0x18/0x140
[ 612.159733] ? process_one_work+0x90/0x9b0
[ 612.159743] ? process_one_work+0x90/0x9b0
[ 612.159754] kasan_report.cold+0x7c/0xd8
[ 612.159778] ? lock_is_held_type+0x80/0x130
[ 612.159789] ? process_one_work+0x90/0x9b0
[ 612.159812] kasan_check_range+0x145/0x1a0
[ 612.159834] process_one_work+0x90/0x9b0
[ 612.159877] ? pwq_dec_nr_in_flight+0x110/0x110
[ 612.159914] ? spin_bug+0x90/0x90
[ 612.159967] worker_thread+0x3b6/0x6c0
[ 612.160023] ? process_one_work+0x9b0/0x9b0
[ 612.160038] kthread+0x1dc/0x200
[ 612.160051] ? kthread_create_worker_on_cpu+0xd0/0xd0
[ 612.160092] ret_from_fork+0x1f/0x30
[ 612.160399] Allocated by task 2358:
[ 612.160757] kasan_save_stack+0x1b/0x40
[ 612.160768] __kasan_kmalloc+0x9b/0xd0
[ 612.160778] cifs_new_fileinfo+0xb0/0x960 [cifs]
[ 612.161170] cifs_open+0xadf/0xf20 [cifs]
[ 612.161421] do_dentry_open+0x2aa/0x6b0
[ 612.161432] path_openat+0xbd9/0xfa0
[ 612.161441] do_filp_open+0x11d/0x230
[ 612.161450] do_sys_openat2+0x115/0x240
[ 612.161460] __x64_sys_openat+0xce/0x140
When mod_delayed_work is called to modify the delay of pending work,
it might return false and queue a new work when pending work is
already scheduled or when try to grab pending work failed.
So, Increase the reference count when new work is scheduled to
avoid use-after-free.
Signed-off-by: Rohith Surabattula <rohiths@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
"Here is a big set of char/misc/other driver fixes for 5.13-rc3.
The majority here is the fallout of the umn.edu re-review of all prior
submissions. That resulted in a bunch of reverts along with the
"correct" changes made, such that there is no regression of any of the
potential fixes that were made by those individuals. I would like to
thank the over 80 different developers who helped with the review and
fixes for this mess.
Other than that, there's a few habanna driver fixes for reported
issues, and some dyndbg fixes for reported problems.
All of these have been in linux-next for a while with no reported
problems"
* tag 'char-misc-5.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (82 commits)
misc: eeprom: at24: check suspend status before disable regulator
uio_hv_generic: Fix another memory leak in error handling paths
uio_hv_generic: Fix a memory leak in error handling paths
uio/uio_pci_generic: fix return value changed in refactoring
Revert "Revert "ALSA: usx2y: Fix potential NULL pointer dereference""
dyndbg: drop uninformative vpr_info
dyndbg: avoid calling dyndbg_emit_prefix when it has no work
binder: Return EFAULT if we fail BINDER_ENABLE_ONEWAY_SPAM_DETECTION
cdrom: gdrom: initialize global variable at init time
brcmfmac: properly check for bus register errors
Revert "brcmfmac: add a check for the status of usb_register"
video: imsttfb: check for ioremap() failures
Revert "video: imsttfb: fix potential NULL pointer dereferences"
net: liquidio: Add missing null pointer checks
Revert "net: liquidio: fix a NULL pointer dereference"
media: gspca: properly check for errors in po1030_probe()
Revert "media: gspca: Check the return value of write_bridge for timeout"
media: gspca: mt9m111: Check write_bridge for timeout
Revert "media: gspca: mt9m111: Check write_bridge for timeout"
media: dvb: Add check on sp8870_readreg return
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull quota fixes from Jan Kara:
"The most important part in the pull is disablement of the new syscall
quotactl_path() which was added in rc1.
The reason is some people at LWN discussion pointed out dirfd would be
useful for this path based syscall and Christian Brauner agreed.
Without dirfd it may be indeed problematic for containers. So let's
just disable the syscall for now when it doesn't have users yet so
that we have more time to mull over how to best specify the filesystem
we want to work on"
* tag 'quota_for_v5.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
quota: Disable quotactl_path syscall
quota: Use 'hlist_for_each_entry' to simplify code
|
|
These ioctl definitions in xfs_fs.h are part of the userspace ABI and
were mistakenly removed during the 5.13 merge window.
Fixes: 9fefd5db08ce ("xfs: convert to fileattr")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
|
|
sc->ip is the inode that's being scrubbed, which means that it's not set
for scrub types that don't involve inodes. If one of those scrubbers
(e.g. inode btrees) returns EDEADLOCK, we'll trip over the null pointer.
Fix that by reporting either the file being examined or the file that
was used to call scrub.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Brian Foster <bfoster@redhat.com>
|
|
If a realtime allocation fails because we can't find a sufficiently
large free extent satisfying locality rules, relax the locality rules
and try again. This reduces the occurrence of short writes to realtime
files when the write size is large and the free space is fragmented.
This was originally discovered by running generic/186 with the realtime
reflink patchset and a 128k cow extent size hint, but the short write
symptoms can manifest with a 128k extent size hint and no reflink, so
apply the fix now.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
|
|
When BLKRRPART is called concurrently with del_gendisk, the partitions
rescan can create a stale partition that will never be be cleaned up.
Fix this by checking the the disk is up before rescanning partitions
while under bd_mutex.
Signed-off-by: Gulam Mohamed <gulam.mohamed@oracle.com>
[hch: split from a larger patch]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20210514131842.1600568-3-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
As an artifact of how gendisk lookup used to work in earlier kernels,
GENHD_FL_UP is only cleared very late in del_gendisk, and a global lock
is used to prevent opens from succeeding while del_gendisk is tearing
down the gendisk. Switch to clearing the flag early and under bd_mutex
so that callers can use bd_mutex to stabilize the flag, which removes
the need for the global mutex.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20210514131842.1600568-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
When multiple processes write data to the same block group on a
compressed zoned filesystem, the underlying device could report I/O
errors and data corruption is possible.
This happens because on a zoned file system, compressed data writes
where sent to the device via a REQ_OP_WRITE instead of a
REQ_OP_ZONE_APPEND operation. But with REQ_OP_WRITE and parallel
submission it cannot be guaranteed that the data is always submitted
aligned to the underlying zone's write pointer.
The change to using REQ_OP_ZONE_APPEND instead of REQ_OP_WRITE on a
zoned filesystem is non intrusive on a regular file system or when
submitting to a conventional zone on a zoned filesystem, as it is
guarded by btrfs_use_zone_append.
Reported-by: David Sterba <dsterba@suse.com>
Fixes: 9d294a685fbc ("btrfs: zoned: enable to mount ZONED incompat flag")
CC: stable@vger.kernel.org # 5.12.x: e380adfc213a13: btrfs: zoned: pass start block to btrfs_use_zone_append
CC: stable@vger.kernel.org # 5.12.x
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
btrfs_use_zone_append only needs the passed in extent_map's block_start
member, so there's no need to pass in the full extent map.
This also enables the use of btrfs_use_zone_append in places where we only
have a start byte but no extent_map.
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
We don't want anyone poking into tctx->io_wq awhile it's being destroyed
by io_wq_put_and_exit(), and even though it shouldn't even happen, if
buggy would be preferable to get a NULL-deref instead of subtle delayed
failure or UAF.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/827b021de17926fd807610b3e53a5a5fa8530856.1621513214.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
When smb2 lease parameter is disabled on server. Server grants
batch oplock instead of RHW lease by default on open, inode page cache
needs to be zapped immediatley upon close as cache is not valid.
Signed-off-by: Rohith Surabattula <rohiths@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Removed oplock_break_received flag which was added to achieve
synchronization between oplock handler and open handler by earlier commit.
It is not needed because there is an existing lock open_file_lock to achieve
the same. find_readable_file takes open_file_lock and then traverses the
openFileList. Similarly, cifs_oplock_break while closing the deferred
handle (i.e cifsFileInfo_put) takes open_file_lock and then sends close
to the server.
Added comments for better readability.
Signed-off-by: Rohith Surabattula <rohiths@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Eliminate the follow smatch warning:
fs/cifs/fs_context.c:1148 smb3_fs_context_parse_param() warn:
inconsistent indenting.
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
When using smb2_copychunk_range() for large ranges we will
run through several iterations of a loop calling SMB2_ioctl()
but never actually free the returned buffer except for the final
iteration.
This leads to memory leaks everytime a large copychunk is requested.
Fixes: 9bf0c9cd4314 ("CIFS: Fix SMB2/SMB3 Copy offload support (refcopy) for large files")
Cc: <stable@vger.kernel.org>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
gitolite.kernel.org:pub/scm/linux/kernel/git/brauner/linux
Pull mount_setattr fix from Christian Brauner:
"This makes an underlying idmapping assumption more explicit.
We currently don't have any filesystems that support idmapped mounts
which are mountable inside a user namespace, i.e. where s_user_ns !=
init_user_ns. That was a deliberate decision for now as userns root
can just mount the filesystem themselves.
Express this restriction explicitly and enforce it until there's a
real use-case for this. This way we can notice it and will have a
chance to adapt and audit our translation helpers and fstests
appropriately if we need to support such filesystems"
* tag 'fs.idmapped.mount_setattr.v5.13-rc3' of gitolite.kernel.org:pub/scm/linux/kernel/git/brauner/linux:
fs/mount_setattr: tighten permission checks
|
|
See MS-SMB2 3.2.4.1.4, file ids in compounded requests should be set to
0xFFFFFFFFFFFFFFFF (we were treating it as u32 not u64 and setting
it incorrectly).
Signed-off-by: Steve French <stfrench@microsoft.com>
Reported-by: Stefan Metzmacher <metze@samba.org>
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
|
|
With the addition of ssi_perf_data and ssi_perf_type struct signalfd_siginfo
is dangerously close to running out of space. All that remains is just
enough space for two additional 64bit fields. A practice of adding all
possible siginfo_t fields into struct singalfd_siginfo can not be supported
as adding the missing fields ssi_lower, ssi_upper, and ssi_pkey would
require two 64bit fields and one 32bit fields. In practice the fields
ssi_perf_data and ssi_perf_type can never be used by signalfd as the signal
that generates them always delivers them synchronously to the thread that
triggers them.
Therefore until someone actually needs the fields ssi_perf_data and
ssi_perf_type in signalfd_siginfo remove them. This leaves a bit more room
for future expansion.
v1: https://lkml.kernel.org/r/20210503203814.25487-12-ebiederm@xmission.com
v2: https://lkml.kernel.org/r/20210505141101.11519-12-ebiederm@xmission.com
Link: https://lkml.kernel.org/r/20210517195748.8880-5-ebiederm@xmission.com
Reviewed-by: Marco Elver <elver@google.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
Don't abuse si_errno and deliver all of the perf data in _perf member
of siginfo_t.
Note: The data field in the perf data structures in a u64 to allow a
pointer to be encoded without needed to implement a 32bit and 64bit
version of the same structure. There already exists a 32bit and 64bit
versions siginfo_t, and the 32bit version can not include a 64bit
member as it only has 32bit alignment. So unsigned long is used in
siginfo_t instead of a u64 as unsigned long can encode a pointer on
all architectures linux supports.
v1: https://lkml.kernel.org/r/m11rarqqx2.fsf_-_@fess.ebiederm.org
v2: https://lkml.kernel.org/r/20210503203814.25487-10-ebiederm@xmission.com
v3: https://lkml.kernel.org/r/20210505141101.11519-11-ebiederm@xmission.com
Link: https://lkml.kernel.org/r/20210517195748.8880-4-ebiederm@xmission.com
Reviewed-by: Marco Elver <elver@google.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
Now that si_trapno is part of the union in _si_fault and available on
all architectures, add SIL_FAULT_TRAPNO and update siginfo_layout to
return SIL_FAULT_TRAPNO when the code assumes si_trapno is valid.
There is room for future changes to reduce when si_trapno is valid but
this is all that is needed to make si_trapno and the other members of
the the union in _sigfault mutually exclusive.
Update the code that uses siginfo_layout to deal with SIL_FAULT_TRAPNO
and have the same code ignore si_trapno in in all other cases.
v1: https://lkml.kernel.org/r/m1o8dvs7s7.fsf_-_@fess.ebiederm.org
v2: https://lkml.kernel.org/r/20210505141101.11519-6-ebiederm@xmission.com
Link: https://lkml.kernel.org/r/20210517195748.8880-2-ebiederm@xmission.com
Reviewed-by: Marco Elver <elver@google.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
"A few more fixes:
- fix fiemap to print extents that could get misreported due to
internal extent splitting and logical merging for fiemap output
- fix RCU stalls during delayed iputs
- fix removed dentries still existing after log is synced"
* tag 'for-5.13-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: fix removed dentries still existing after log is synced
btrfs: return whole extents in fiemap
btrfs: avoid RCU stalls while running delayed iputs
btrfs: return 0 for dev_extent_hole_check_zoned hole_start in case of error
|
|
While doing error injection testing I got the following panic
kernel BUG at fs/btrfs/tree-log.c:1862!
invalid opcode: 0000 [#1] SMP NOPTI
CPU: 1 PID: 7836 Comm: mount Not tainted 5.13.0-rc1+ #305
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014
RIP: 0010:link_to_fixup_dir+0xd5/0xe0
RSP: 0018:ffffb5800180fa30 EFLAGS: 00010216
RAX: fffffffffffffffb RBX: 00000000fffffffb RCX: ffff8f595287faf0
RDX: ffffb5800180fa37 RSI: ffff8f5954978800 RDI: 0000000000000000
RBP: ffff8f5953af9450 R08: 0000000000000019 R09: 0000000000000001
R10: 000151f408682970 R11: 0000000120021001 R12: ffff8f5954978800
R13: ffff8f595287faf0 R14: ffff8f5953c77dd0 R15: 0000000000000065
FS: 00007fc5284c8c40(0000) GS:ffff8f59bbd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fc5287f47c0 CR3: 000000011275e002 CR4: 0000000000370ee0
Call Trace:
replay_one_buffer+0x409/0x470
? btree_read_extent_buffer_pages+0xd0/0x110
walk_up_log_tree+0x157/0x1e0
walk_log_tree+0xa6/0x1d0
btrfs_recover_log_trees+0x1da/0x360
? replay_one_extent+0x7b0/0x7b0
open_ctree+0x1486/0x1720
btrfs_mount_root.cold+0x12/0xea
? __kmalloc_track_caller+0x12f/0x240
legacy_get_tree+0x24/0x40
vfs_get_tree+0x22/0xb0
vfs_kern_mount.part.0+0x71/0xb0
btrfs_mount+0x10d/0x380
? vfs_parse_fs_string+0x4d/0x90
legacy_get_tree+0x24/0x40
vfs_get_tree+0x22/0xb0
path_mount+0x433/0xa10
__x64_sys_mount+0xe3/0x120
do_syscall_64+0x3d/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xae
We can get -EIO or any number of legitimate errors from
btrfs_search_slot(), panicing here is not the appropriate response. The
error path for this code handles errors properly, simply return the
error.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
When cloning an inline extent there are a few cases, such as when we have
an implicit hole at file offset 0, where we start a transaction while
holding a read lock on a leaf. Starting the transaction results in a call
to sb_start_intwrite(), which results in doing a read lock on a percpu
semaphore. Lockdep doesn't like this and complains about it:
[46.580704] ======================================================
[46.580752] WARNING: possible circular locking dependency detected
[46.580799] 5.13.0-rc1 #28 Not tainted
[46.580832] ------------------------------------------------------
[46.580877] cloner/3835 is trying to acquire lock:
[46.580918] c00000001301d638 (sb_internal#2){.+.+}-{0:0}, at: clone_copy_inline_extent+0xe4/0x5a0
[46.581167]
[46.581167] but task is already holding lock:
[46.581217] c000000007fa2550 (btrfs-tree-00){++++}-{3:3}, at: __btrfs_tree_read_lock+0x70/0x1d0
[46.581293]
[46.581293] which lock already depends on the new lock.
[46.581293]
[46.581351]
[46.581351] the existing dependency chain (in reverse order) is:
[46.581410]
[46.581410] -> #1 (btrfs-tree-00){++++}-{3:3}:
[46.581464] down_read_nested+0x68/0x200
[46.581536] __btrfs_tree_read_lock+0x70/0x1d0
[46.581577] btrfs_read_lock_root_node+0x88/0x200
[46.581623] btrfs_search_slot+0x298/0xb70
[46.581665] btrfs_set_inode_index+0xfc/0x260
[46.581708] btrfs_new_inode+0x26c/0x950
[46.581749] btrfs_create+0xf4/0x2b0
[46.581782] lookup_open.isra.57+0x55c/0x6a0
[46.581855] path_openat+0x418/0xd20
[46.581888] do_filp_open+0x9c/0x130
[46.581920] do_sys_openat2+0x2ec/0x430
[46.581961] do_sys_open+0x90/0xc0
[46.581993] system_call_exception+0x3d4/0x410
[46.582037] system_call_common+0xec/0x278
[46.582078]
[46.582078] -> #0 (sb_internal#2){.+.+}-{0:0}:
[46.582135] __lock_acquire+0x1e90/0x2c50
[46.582176] lock_acquire+0x2b4/0x5b0
[46.582263] start_transaction+0x3cc/0x950
[46.582308] clone_copy_inline_extent+0xe4/0x5a0
[46.582353] btrfs_clone+0x5fc/0x880
[46.582388] btrfs_clone_files+0xd8/0x1c0
[46.582434] btrfs_remap_file_range+0x3d8/0x590
[46.582481] do_clone_file_range+0x10c/0x270
[46.582558] vfs_clone_file_range+0x1b0/0x310
[46.582605] ioctl_file_clone+0x90/0x130
[46.582651] do_vfs_ioctl+0x874/0x1ac0
[46.582697] sys_ioctl+0x6c/0x120
[46.582733] system_call_exception+0x3d4/0x410
[46.582777] system_call_common+0xec/0x278
[46.582822]
[46.582822] other info that might help us debug this:
[46.582822]
[46.582888] Possible unsafe locking scenario:
[46.582888]
[46.582942] CPU0 CPU1
[46.582984] ---- ----
[46.583028] lock(btrfs-tree-00);
[46.583062] lock(sb_internal#2);
[46.583119] lock(btrfs-tree-00);
[46.583174] lock(sb_internal#2);
[46.583212]
[46.583212] *** DEADLOCK ***
[46.583212]
[46.583266] 6 locks held by cloner/3835:
[46.583299] #0: c00000001301d448 (sb_writers#12){.+.+}-{0:0}, at: ioctl_file_clone+0x90/0x130
[46.583382] #1: c00000000f6d3768 (&sb->s_type->i_mutex_key#15){+.+.}-{3:3}, at: lock_two_nondirectories+0x58/0xc0
[46.583477] #2: c00000000f6d72a8 (&sb->s_type->i_mutex_key#15/4){+.+.}-{3:3}, at: lock_two_nondirectories+0x9c/0xc0
[46.583574] #3: c00000000f6d7138 (&ei->i_mmap_lock){+.+.}-{3:3}, at: btrfs_remap_file_range+0xd0/0x590
[46.583657] #4: c00000000f6d35f8 (&ei->i_mmap_lock/1){+.+.}-{3:3}, at: btrfs_remap_file_range+0xe0/0x590
[46.583743] #5: c000000007fa2550 (btrfs-tree-00){++++}-{3:3}, at: __btrfs_tree_read_lock+0x70/0x1d0
[46.583828]
[46.583828] stack backtrace:
[46.583872] CPU: 1 PID: 3835 Comm: cloner Not tainted 5.13.0-rc1 #28
[46.583931] Call Trace:
[46.583955] [c0000000167c7200] [c000000000c1ee78] dump_stack+0xec/0x144 (unreliable)
[46.584052] [c0000000167c7240] [c000000000274058] print_circular_bug.isra.32+0x3a8/0x400
[46.584123] [c0000000167c72e0] [c0000000002741f4] check_noncircular+0x144/0x190
[46.584191] [c0000000167c73b0] [c000000000278fc0] __lock_acquire+0x1e90/0x2c50
[46.584259] [c0000000167c74f0] [c00000000027aa94] lock_acquire+0x2b4/0x5b0
[46.584317] [c0000000167c75e0] [c000000000a0d6cc] start_transaction+0x3cc/0x950
[46.584388] [c0000000167c7690] [c000000000af47a4] clone_copy_inline_extent+0xe4/0x5a0
[46.584457] [c0000000167c77c0] [c000000000af525c] btrfs_clone+0x5fc/0x880
[46.584514] [c0000000167c7990] [c000000000af5698] btrfs_clone_files+0xd8/0x1c0
[46.584583] [c0000000167c7a00] [c000000000af5b58] btrfs_remap_file_range+0x3d8/0x590
[46.584652] [c0000000167c7ae0] [c0000000005d81dc] do_clone_file_range+0x10c/0x270
[46.584722] [c0000000167c7b40] [c0000000005d84f0] vfs_clone_file_range+0x1b0/0x310
[46.584793] [c0000000167c7bb0] [c00000000058bf80] ioctl_file_clone+0x90/0x130
[46.584861] [c0000000167c7c10] [c00000000058c894] do_vfs_ioctl+0x874/0x1ac0
[46.584922] [c0000000167c7d10] [c00000000058db4c] sys_ioctl+0x6c/0x120
[46.584978] [c0000000167c7d60] [c0000000000364a4] system_call_exception+0x3d4/0x410
[46.585046] [c0000000167c7e10] [c00000000000d45c] system_call_common+0xec/0x278
[46.585114] --- interrupt: c00 at 0x7ffff7e22990
[46.585160] NIP: 00007ffff7e22990 LR: 00000001000010ec CTR: 0000000000000000
[46.585224] REGS: c0000000167c7e80 TRAP: 0c00 Not tainted (5.13.0-rc1)
[46.585280] MSR: 800000000280f033 <SF,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE> CR: 28000244 XER: 00000000
[46.585374] IRQMASK: 0
[46.585374] GPR00: 0000000000000036 00007fffffffdec0 00007ffff7f17100 0000000000000004
[46.585374] GPR04: 000000008020940d 00007fffffffdf40 0000000000000000 0000000000000000
[46.585374] GPR08: 0000000000000004 0000000000000000 0000000000000000 0000000000000000
[46.585374] GPR12: 0000000000000000 00007ffff7ffa940 0000000000000000 0000000000000000
[46.585374] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[46.585374] GPR20: 0000000000000000 000000009123683e 00007fffffffdf40 0000000000000000
[46.585374] GPR24: 0000000000000000 0000000000000000 0000000000000000 0000000000000004
[46.585374] GPR28: 0000000100030260 0000000100030280 0000000000000003 000000000000005f
[46.585919] NIP [00007ffff7e22990] 0x7ffff7e22990
[46.585964] LR [00000001000010ec] 0x1000010ec
[46.586010] --- interrupt: c00
This should be a false positive, as both locks are acquired in read mode.
Nevertheless, we don't need to hold a leaf locked when we start the
transaction, so just release the leaf (path) before starting it.
Reported-by: Ritesh Harjani <riteshh@linux.ibm.com>
Link: https://lore.kernel.org/linux-btrfs/20210513214404.xks77p566fglzgum@riteshh-domain/
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
__io_queue_proc() is used by both poll and apoll, so we should not
access req->poll directly but selecting right struct io_poll_iocb
depending on use case.
Reported-and-tested-by: syzbot+a84b8783366ecb1c65d0@syzkaller.appspotmail.com
Fixes: ea6a693d862d ("io_uring: disable multishot poll for double poll add cases")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/4a6a1de31142d8e0250fe2dfd4c8923d82a5bbfc.1621251795.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Deadstore detected by Lukas Bulwahn's CodeChecker Tool (ELISA group).
line 741 struct cifsInodeInfo *cinode;
line 747 cinode = CIFS_I(d_inode(cfile->dentry));
could be deleted.
cinode on filesystem should not be deleted when files are closed,
they are representations of some data fields on a physical disk,
thus no further action is required.
The virtual inode on vfs will be handled by vfs automatically,
and the denotation is inode, which is different from the cinode.
Signed-off-by: wenhuizhang <wenhui@gwmail.gwu.edu>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
xfs_bmap_rtalloc doesn't handle realtime extent files with extent size
hints larger than the rt volume's extent size properly, because
xfs_bmap_extsize_align can adjust the offset/length parameters to try to
fit the extent size hint.
Under these conditions, minlen has to be large enough so that any
allocation returned by xfs_rtallocate_extent will be large enough to
cover at least one of the blocks that the caller asked for. If the
allocation is too short, bmapi_write will return no mapping for the
requested range, which causes ENOSPC errors in other parts of the
filesystem.
Therefore, adjust minlen upwards to fix this. This can be found by
running generic/263 (g/127 or g/522) with a realtime extent size hint
that's larger than the rt volume extent size.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
|
|
Merge misc fixes from Andrew Morton:
"13 patches.
Subsystems affected by this patch series: resource, squashfs, hfsplus,
modprobe, and mm (hugetlb, slub, userfaultfd, ksm, pagealloc, kasan,
pagemap, and ioremap)"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
mm/ioremap: fix iomap_max_page_shift
docs: admin-guide: update description for kernel.modprobe sysctl
hfsplus: prevent corruption in shrinking truncate
mm/filemap: fix readahead return types
kasan: fix unit tests with CONFIG_UBSAN_LOCAL_BOUNDS enabled
mm: fix struct page layout on 32-bit systems
ksm: revert "use GET_KSM_PAGE_NOLOCK to get ksm page in remove_rmap_item_from_tree()"
userfaultfd: release page in error path to avoid BUG_ON
squashfs: fix divide error in calculate_skip()
kernel/resource: fix return code check in __request_free_mem_region
mm, slub: move slub_debug static key enabling outside slab_mutex
mm/hugetlb: fix cow where page writtable in child
mm/hugetlb: fix F_SEAL_FUTURE_WRITE
|
|
Pull io_uring fixes from Jens Axboe:
"Just a few minor fixes/changes:
- Fix issue with double free race for linked timeout completions
- Fix reference issue with timeouts
- Remove last few places that make SQPOLL special, since it's just an
io thread now.
- Bump maximum allowed registered buffers, as we don't allocate as
much anymore"
* tag 'io_uring-5.13-2021-05-14' of git://git.kernel.dk/linux-block:
io_uring: increase max number of reg buffers
io_uring: further remove sqpoll limits on opcodes
io_uring: fix ltout double free on completion race
io_uring: fix link timeout refs
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs
Pull erofs fixes from Gao Xiang:
"This mainly fixes 1 lcluster-sized pclusters for the big pcluster
feature, which can be forcely generated by mkfs as a specific on-disk
case for per-(sub)file compression strategies but missed to handle in
runtime properly.
Also, documentation updates are included to fix the broken
illustration due to the ReST conversion by accident and complete the
big pcluster introduction.
Summary:
- update documentation to fix the broken illustration due to ReST
conversion by accident at that time and complete the big pcluster
introduction
- fix 1 lcluster-sized pclusters for the big pcluster feature"
* tag 'erofs-for-5.13-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
erofs: fix 1 lcluster-sized pcluster for big pcluster
erofs: update documentation about data compression
erofs: fix broken illustration in documentation
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull dax fixes from Dan Williams:
"A fix for a hang condition due to missed wakeups in the filesystem-dax
core when exercised by virtiofs.
This bug has been there from the beginning, but the condition has
not triggered on other filesystems since they hold a lock over
invalidation events"
* tag 'dax-fixes-5.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
dax: Wake up all waiters after invalidating dax entry
dax: Add a wakeup mode parameter to put_unlocked_entry()
dax: Add an enum for specifying dax wakup mode
|
|
I believe there are some issues introduced by commit 31651c607151
("hfsplus: avoid deadlock on file truncation")
HFS+ has extent records which always contains 8 extents. In case the
first extent record in catalog file gets full, new ones are allocated from
extents overflow file.
In case shrinking truncate happens to middle of an extent record which
locates in extents overflow file, the logic in hfsplus_file_truncate() was
changed so that call to hfs_brec_remove() is not guarded any more.
Right action would be just freeing the extents that exceed the new size
inside extent record by calling hfsplus_free_extents(), and then check if
the whole extent record should be removed. However since the guard
(blk_cnt > start) is now after the call to hfs_brec_remove(), this has
unfortunate effect that the last matching extent record is removed
unconditionally.
To reproduce this issue, create a file which has at least 10 extents, and
then perform shrinking truncate into middle of the last extent record, so
that the number of remaining extents is not under or divisible by 8. This
causes the last extent record (8 extents) to be removed totally instead of
truncating into middle of it. Thus this causes corruption, and lost data.
Fix for this is simply checking if the new truncated end is below the
start of this extent record, making it safe to remove the full extent
record. However call to hfs_brec_remove() can't be moved to it's previous
place since we're dropping ->tree_lock and it can cause a race condition
and the cached info being invalidated possibly corrupting the node data.
Another issue is related to this one. When entering into the block
(blk_cnt > start) we are not holding the ->tree_lock. We break out from
the loop not holding the lock, but hfs_find_exit() does unlock it. Not
sure if it's possible for someone else to take the lock under our feet,
but it can cause hard to debug errors and premature unlocking. Even if
there's no real risk of it, the locking should still always be kept in
balance. Thus taking the lock now just before the check.
Link: https://lkml.kernel.org/r/20210429165139.3082828-1-jouni.roivas@tuxera.com
Fixes: 31651c607151f ("hfsplus: avoid deadlock on file truncation")
Signed-off-by: Jouni Roivas <jouni.roivas@tuxera.com>
Reviewed-by: Anton Altaparmakov <anton@tuxera.com>
Cc: Anatoly Trosinenko <anatoly.trosinenko@gmail.com>
Cc: Viacheslav Dubeyko <slava@dubeyko.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
A readahead request will not allocate more memory than can be represented
by a size_t, even on systems that have HIGHMEM available. Change the
length functions from returning an loff_t to a size_t.
Link: https://lkml.kernel.org/r/20210510201201.1558972-1-willy@infradead.org
Fixes: 32c0a6bcaa1f57 ("btrfs: add and use readahead_batch_length")
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Sysbot has reported a "divide error" which has been identified as being
caused by a corrupted file_size value within the file inode. This value
has been corrupted to a much larger value than expected.
Calculate_skip() is passed i_size_read(inode) >> msblk->block_log. Due to
the file_size value corruption this overflows the int argument/variable in
that function, leading to the divide error.
This patch changes the function to use u64. This will accommodate any
unexpectedly large values due to corruption.
The value returned from calculate_skip() is clamped to be never more than
SQUASHFS_CACHED_BLKS - 1, or 7. So file_size corruption does not lead to
an unexpectedly large return result here.
Link: https://lkml.kernel.org/r/20210507152618.9447-1-phillip@squashfs.org.uk
Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
Reported-by: <syzbot+e8f781243ce16ac2f962@syzkaller.appspotmail.com>
Reported-by: <syzbot+7b98870d4fec9447b951@syzkaller.appspotmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Patch series "mm/hugetlb: Fix issues on file sealing and fork", v2.
Hugh reported issue with F_SEAL_FUTURE_WRITE not applied correctly to
hugetlbfs, which I can easily verify using the memfd_test program, which
seems that the program is hardly run with hugetlbfs pages (as by default
shmem).
Meanwhile I found another probably even more severe issue on that hugetlb
fork won't wr-protect child cow pages, so child can potentially write to
parent private pages. Patch 2 addresses that.
After this series applied, "memfd_test hugetlbfs" should start to pass.
This patch (of 2):
F_SEAL_FUTURE_WRITE is missing for hugetlb starting from the first day.
There is a test program for that and it fails constantly.
$ ./memfd_test hugetlbfs
memfd-hugetlb: CREATE
memfd-hugetlb: BASIC
memfd-hugetlb: SEAL-WRITE
memfd-hugetlb: SEAL-FUTURE-WRITE
mmap() didn't fail as expected
Aborted (core dumped)
I think it's probably because no one is really running the hugetlbfs test.
Fix it by checking FUTURE_WRITE also in hugetlbfs_file_mmap() as what we
do in shmem_mmap(). Generalize a helper for that.
Link: https://lkml.kernel.org/r/20210503234356.9097-1-peterx@redhat.com
Link: https://lkml.kernel.org/r/20210503234356.9097-2-peterx@redhat.com
Fixes: ab3948f58ff84 ("mm/memfd: add an F_SEAL_FUTURE_WRITE seal to memfd")
Signed-off-by: Peter Xu <peterx@redhat.com>
Reported-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs fixes from Jaegeuk Kim:
"This fixes some critical bugs such as memory leak in compression
flows, kernel panic when handling errors, and swapon failure due to
newly added condition check"
* tag 'f2fs-5.13-rc1-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs:
f2fs: return EINVAL for hole cases in swap file
f2fs: avoid swapon failure by giving a warning first
f2fs: compress: fix to assign cc.cluster_idx correctly
f2fs: compress: fix race condition of overwrite vs truncate
f2fs: compress: fix to free compress page correctly
f2fs: support iflag change given the mask
f2fs: avoid null pointer access when handling IPU error
|
|
Since recent changes instead of storing a large array of struct
io_mapped_ubuf, we store pointers to them, that is 4 times slimmer and
we should not to so worry about restricting max number of registererd
buffer slots, increase the limit 4 times.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/d3dee1da37f46da416aa96a16bf9e5094e10584d.1620990371.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
There are three types of requests that left disabled for sqpoll, namely
epoll ctx, statx, and resources update. Since SQPOLL task is now closely
mimics a userspace thread, remove the restrictions.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/909b52d70c45636d8d7897582474ea5aab5eed34.1620990306.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Always remove linked timeout on io_link_timeout_fn() from the master
request link list, otherwise we may get use-after-free when first
io_link_timeout_fn() puts linked timeout in the fail path, and then
will be found and put on master's free.
Cc: stable@vger.kernel.org # 5.10+
Fixes: 90cd7e424969d ("io_uring: track link timeout's master explicitly")
Reported-and-tested-by: syzbot+5a864149dd970b546223@syzkaller.appspotmail.com
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/69c46bf6ce37fec4fdcd98f0882e18eb07ce693a.1620990121.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|