Age | Commit message (Collapse) | Author | Files | Lines |
|
We have some very odd semantics for reading the command line through
/proc, because we allow people to rewrite their own command line pretty
much at will, and things get positively funky when you extend your
command line past the point that used to be the end of the command line,
and is now in the environment variable area.
But our weird semantics doesn't mean that we should write weird and
complex code to handle them.
So re-write get_mm_cmdline() to be much simpler, and much more explicit
about what it is actually doing and why. And avoid the extra check for
"is there a NUL character at the end of the command line where I expect
one to be", by simply making the NUL character handling be part of the
normal "once you hit the end of the command line, stop at the first NUL
character" logic.
It's quite possible that we should stop the crazy "walk into
environment" entirely, but happily it's not really the usual case.
NOTE! We tried to really simplify and limit our odd cmdline parsing some
time ago, but people complained. See commit c2c0bb44620d ("proc: fix
PAGE_SIZE limit of /proc/$PID/cmdline") for details about why we have
this complexity.
Cc: Tejun Heo <tj@kernel.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Jarod Wilson <jarod@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This is a pure refactoring of the function, preparing for some further
cleanups. The thing was pretty illegible, and the core functionality
still is, but now the core loop is a bit more isolated from the thing
that goes on around it.
This was "inspired" by the confluence of kworker workqueue name cleanups
by Tejun, currently scheduled for 4.18, and commit 7f7ccc2ccc2e ("proc:
do not access cmdline nor environ from file-backed areas").
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Pull kvm fixes from Paolo Bonzini:
- ARM/ARM64 locking fixes
- x86 fixes: PCID, UMIP, locking
- improved support for recent Windows version that have a 2048 Hz APIC
timer
- rename KVM_HINTS_DEDICATED CPUID bit to KVM_HINTS_REALTIME
- better behaved selftests
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
kvm: rename KVM_HINTS_DEDICATED to KVM_HINTS_REALTIME
KVM: arm/arm64: VGIC/ITS save/restore: protect kvm_read_guest() calls
KVM: arm/arm64: VGIC/ITS: protect kvm_read_guest() calls with SRCU lock
KVM: arm/arm64: VGIC/ITS: Promote irq_lock() in update_affinity
KVM: arm/arm64: Properly protect VGIC locks from IRQs
KVM: X86: Lower the default timer frequency limit to 200us
KVM: vmx: update sec exec controls for UMIP iff emulating UMIP
kvm: x86: Suppress CR3_PCID_INVD bit only when PCIDs are enabled
KVM: selftests: exit with 0 status code when tests cannot be run
KVM: hyperv: idr_find needs RCU protection
x86: Delay skip of emulated hypercall instruction
KVM: Extend MAX_IRQ_ROUTES to 4096 for all archs
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"We have a core fix in the compat code for covering a potential race
(double references), but it's a very minor change.
The rest are all small device-specific quirks, as well as a correction
of the new UAC3 support code"
* tag 'sound-4.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: usb-audio: Use Class Specific EP for UAC3 devices.
ALSA: hda/realtek - Clevo P950ER ALC1220 Fixup
ALSA: usb: mixer: volume quirk for CM102-A+/102S+
ALSA: hda: Add Lenovo C50 All in one to the power_save blacklist
ALSA: control: fix a redundant-copy issue
|
|
KVM_HINTS_DEDICATED seems to be somewhat confusing:
Guest doesn't really care whether it's the only task running on a host
CPU as long as it's not preempted.
And there are more reasons for Guest to be preempted than host CPU
sharing, for example, with memory overcommit it can get preempted on a
memory access, post copy migration can cause preemption, etc.
Let's call it KVM_HINTS_REALTIME which seems to better
match what guests expect.
Also, the flag most be set on all vCPUs - current guests assume this.
Note so in the documentation.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Martin Schwidefsky:
- a fix for the vfio ccw translation code
- update an incorrect email address in the MAINTAINERS file
- fix a division by zero oops in the cpum_sf code found by trinity
- two fixes for the error handling of the qdio code
- several spectre related patches to convert all left-over indirect
branches in the kernel to expoline branches
- update defconfigs to avoid warnings due to the netfilter Kconfig
changes
- avoid several compiler warnings in the kexec_file code for s390
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/qdio: don't release memory in qdio_setup_irq()
s390/qdio: fix access to uninitialized qdio_q fields
s390/cpum_sf: ensure sample frequency of perf event attributes is non-zero
s390: use expoline thunks in the BPF JIT
s390: extend expoline to BC instructions
s390: remove indirect branch from do_softirq_own_stack
s390: move spectre sysfs attribute code
s390/kernel: use expoline for indirect branches
s390/ftrace: use expoline for indirect branches
s390/lib: use expoline for indirect branches
s390/crc32-vx: use expoline for indirect branches
s390: move expoline assembler macros to a header
vfio: ccw: fix cleanup if cp_prefetch fails
s390/kexec_file: add declaration of purgatory related globals
s390: update defconfigs
MAINTAINERS: update s390 zcrypt maintainers email address
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull SELinux fixes from Paul Moore:
"A small pull request to fix a few regressions in the SELinux/SCTP code
with applications that call bind() with AF_UNSPEC/INADDR_ANY.
The individual commit descriptions have more information, but the
commits themselves should be self explanatory"
* tag 'selinux-pr-20180516' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: correctly handle sa_family cases in selinux_sctp_bind_connect()
selinux: fix address family in bind() and connect() to match address/port
selinux: add AF_UNSPEC and INADDR_ANY checks to selinux_socket_bind()
|
|
proc_pid_cmdline_read() and environ_read() directly access the target
process' VM to retrieve the command line and environment. If this
process remaps these areas onto a file via mmap(), the requesting
process may experience various issues such as extra delays if the
underlying device is slow to respond.
Let's simply refuse to access file-backed areas in these functions.
For this we add a new FOLL_ANON gup flag that is passed to all calls
to access_remote_vm(). The code already takes care of such failures
(including unmapped areas). Accesses via /proc/pid/mem were not
changed though.
This was assigned CVE-2018-1120.
Note for stable backports: the patch may apply to kernels prior to 4.11
but silently miss one location; it must be checked that no call to
access_remote_vm() keeps zero as the last argument.
Reported-by: Qualys Security Advisory <qsa@qualys.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fix from Steven Rostedt:
"Some of the ftrace internal events use a zero for a data size of a
field event. This is increasingly important for the histogram trigger
work that is being extended.
While auditing trace events, I found that a couple of the xen events
were used as just marking that a function was called, by creating a
static array of size zero. This can play havoc with the tracing
features if these events are used, because a zero size of a static
array is denoted as a special nul terminated dynamic array (this is
what the trace_marker code uses). But since the xen events have no
size, they are not nul terminated, and unexpected results may occur.
As trace events were never intended on being a marker to denote that a
function was hit or not, especially since function tracing and kprobes
can trivially do the same, the best course of action is to simply
remove these events"
* tag 'trace-v4.17-rc4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing/x86/xen: Remove zero data size trace events trace_xen_mmu_flush_tlb{_all}
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull memory barrier for from Steven Rostedt:
"The memory barrier usage in updating the random ptr hash for %p in
vsprintf is incorrect.
Instead of adding the read memory barrier into vsprintf() which will
cause a slight degradation to a commonly used function in the kernel
just to solve a very unlikely race condition that can only happen at
boot up, change the code from using a variable branch to a
static_branch.
Not only does this solve the race condition, it actually will improve
the performance of vsprintf() by removing the conditional branch that
is only needed at boot"
* tag 'trace-v4.17-rc5-vsprintf' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
vsprintf: Replace memory barrier with static_key for random_ptr_key update
|
|
Reviewing Tobin's patches for getting pointers out early before
entropy has been established, I noticed that there's a lone smp_mb() in
the code. As with most lone memory barriers, this one appears to be
incorrectly used.
We currently basically have this:
get_random_bytes(&ptr_key, sizeof(ptr_key));
/*
* have_filled_random_ptr_key==true is dependent on get_random_bytes().
* ptr_to_id() needs to see have_filled_random_ptr_key==true
* after get_random_bytes() returns.
*/
smp_mb();
WRITE_ONCE(have_filled_random_ptr_key, true);
And later we have:
if (unlikely(!have_filled_random_ptr_key))
return string(buf, end, "(ptrval)", spec);
/* Missing memory barrier here. */
hashval = (unsigned long)siphash_1u64((u64)ptr, &ptr_key);
As the CPU can perform speculative loads, we could have a situation
with the following:
CPU0 CPU1
---- ----
load ptr_key = 0
store ptr_key = random
smp_mb()
store have_filled_random_ptr_key
load have_filled_random_ptr_key = true
BAD BAD BAD! (you're so bad!)
Because nothing prevents CPU1 from loading ptr_key before loading
have_filled_random_ptr_key.
But this race is very unlikely, but we can't keep an incorrect smp_mb() in
place. Instead, replace the have_filled_random_ptr_key with a static_branch
not_filled_random_ptr_key, that is initialized to true and changed to false
when we get enough entropy. If the update happens in early boot, the
static_key is updated immediately, otherwise it will have to wait till
entropy is filled and this happens in an interrupt handler which can't
enable a static_key, as that requires a preemptible context. In that case, a
work_queue is used to enable it, as entropy already took too long to
establish in the first place waiting a little more shouldn't hurt anything.
The benefit of using the static key is that the unlikely branch in
vsprintf() now becomes a nop.
Link: http://lkml.kernel.org/r/20180515100558.21df515e@gandalf.local.home
Cc: stable@vger.kernel.org
Fixes: ad67b74d2469d ("printk: hash addresses printed with %p")
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
Pull AFS fixes from David Howells:
"Here's a set of patches that fix a number of bugs in the in-kernel AFS
client, including:
- Fix directory locking to not use individual page locks for
directory reading/scanning but rather to use a semaphore on the
afs_vnode struct as the directory contents must be read in a single
blob and data from different reads must not be mixed as the entire
contents may be shuffled about between reads.
- Fix address list parsing to handle port specifiers correctly.
- Only give up callback records on a server if we actually talked to
that server (we might not be able to access a server).
- Fix some callback handling bugs, including refcounting,
whole-volume callbacks and when callbacks actually get broken in
response to a CB.CallBack op.
- Fix some server/address rotation bugs, including giving up if we
can't probe a server; giving up if a server says it doesn't have a
volume, but there are more servers to try.
- Fix the decoding of fetched statuses to be OpenAFS compatible.
- Fix the handling of server lookups in Cache Manager ops (such as
CB.InitCallBackState3) to use a UUID if possible and to handle no
server being found.
- Fix a bug in server lookup where not all addresses are compared.
- Fix the non-encryption of calls that prevents some servers from
being accessed (this also requires an AF_RXRPC patch that has
already gone in through the net tree).
There's also a patch that adds tracepoints to log Cache Manager ops
that don't find a matching server, either by UUID or by address"
* tag 'afs-fixes-20180514' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
afs: Fix the non-encryption of calls
afs: Fix CB.CallBack handling
afs: Fix whole-volume callback handling
afs: Fix afs_find_server search loop
afs: Fix the handling of an unfound server in CM operations
afs: Add a tracepoint to record callbacks from unlisted servers
afs: Fix the handling of CB.InitCallBackState3 to find the server by UUID
afs: Fix VNOVOL handling in address rotation
afs: Fix AFSFetchStatus decoder to provide OpenAFS compatibility
afs: Fix server rotation's handling of fileserver probe failure
afs: Fix refcounting in callback registration
afs: Fix giving up callbacks on server destruction
afs: Fix address list parsing
afs: Fix directory page locking
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Two small driver fixes: aacraid to fix an unknown IU type on task
management functions which causes a firmware fault and vmw_pvscsi to
change a return code to retry the operation instead of causing an
immediate error"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: aacraid: Correct hba_send to include iu_type
scsi: vmw-pvscsi: return DID_BUS_BUSY for adapter-initated aborts
|
|
git://people.freedesktop.org/~airlied/linux
Pull drm fix from Dave Airlie:
"This fixes the mmap regression reported to me on irc by an i686 kernel
user today, he's tested the fix works, and I've audited all the drm
drivers for the bad mmap usage and since we use the mmap offset as a
lookup in a table we aren't inclined to have anything bad in there"
[ See commit be83bbf80682 ("mmap: introduce sane default mmap limits")
for details and the note on why the GPU drivers were expected to be a
special case. - Linus ]
* tag 'drm-fixes-for-v4.17-rc6-urgent' of git://people.freedesktop.org/~airlied/linux:
drm: set FMODE_UNSIGNED_OFFSET for drm files
|
|
kvm_read_guest() will eventually look up in kvm_memslots(), which requires
either to hold the kvm->slots_lock or to be inside a kvm->srcu critical
section.
In contrast to x86 and s390 we don't take the SRCU lock on every guest
exit, so we have to do it individually for each kvm_read_guest() call.
Use the newly introduced wrapper for that.
Cc: Stable <stable@vger.kernel.org> # 4.12+
Reported-by: Jan Glauber <jan.glauber@caviumnetworks.com>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
kvm_read_guest() will eventually look up in kvm_memslots(), which requires
either to hold the kvm->slots_lock or to be inside a kvm->srcu critical
section.
In contrast to x86 and s390 we don't take the SRCU lock on every guest
exit, so we have to do it individually for each kvm_read_guest() call.
Provide a wrapper which does that and use that everywhere.
Note that ending the SRCU critical section before returning from the
kvm_read_guest() wrapper is safe, because the data has been *copied*, so
we don't need to rely on valid references to the memslot anymore.
Cc: Stable <stable@vger.kernel.org> # 4.8+
Reported-by: Jan Glauber <jan.glauber@caviumnetworks.com>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Apparently the development of update_affinity() overlapped with the
promotion of irq_lock to be _irqsave, so the patch didn't convert this
lock over. This will make lockdep complain.
Fix this by disabling IRQs around the lock.
Cc: stable@vger.kernel.org
Fixes: 08c9fd042117 ("KVM: arm/arm64: vITS: Add a helper to update the affinity of an LPI")
Reported-by: Jan Glauber <jan.glauber@caviumnetworks.com>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
As Jan reported [1], lockdep complains about the VGIC not being bullet
proof. This seems to be due to two issues:
- When commit 006df0f34930 ("KVM: arm/arm64: Support calling
vgic_update_irq_pending from irq context") promoted irq_lock and
ap_list_lock to _irqsave, we forgot two instances of irq_lock.
lockdeps seems to pick those up.
- If a lock is _irqsave, any other locks we take inside them should be
_irqsafe as well. So the lpi_list_lock needs to be promoted also.
This fixes both issues by simply making the remaining instances of those
locks _irqsave.
One irq_lock is addressed in a separate patch, to simplify backporting.
[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2018-May/575718.html
Cc: stable@vger.kernel.org
Fixes: 006df0f34930 ("KVM: arm/arm64: Support calling vgic_update_irq_pending from irq context")
Reported-by: Jan Glauber <jan.glauber@caviumnetworks.com>
Acked-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Anthoine reported:
The period used by Windows change over time but it can be 1
milliseconds or less. I saw the limit_periodic_timer_frequency
print so 500 microseconds is sometimes reached.
As suggested by Paolo, lower the default timer frequency limit to a
smaller interval of 200 us (5000 Hz) to leave some headroom. This
is required due to Windows 10 changing the scheduler tick limit
from 1024 Hz to 2048 Hz.
Reported-by: Anthoine Bourgeois <anthoine.bourgeois@blade-group.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Anthoine Bourgeois <anthoine.bourgeois@blade-group.com>
Cc: Darren Kenny <darren.kenny@oracle.com>
Cc: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Calling qdio_release_memory() on error is just plain wrong. It frees
the main qdio_irq struct, when following code still uses it.
Also, no other error path in qdio_establish() does this. So trust
callers to clean up via qdio_free() if some step of the QDIO
initialization fails.
Fixes: 779e6e1c724d ("[S390] qdio: new qdio driver.")
Cc: <stable@vger.kernel.org> #v2.6.27+
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
Ever since CQ/QAOB support was added, calling qdio_free() straight after
qdio_alloc() results in qdio_release_memory() accessing uninitialized
memory (ie. q->u.out.use_cq and q->u.out.aobs). Followed by a
kmem_cache_free() on the random AOB addresses.
For older kernels that don't have 6e30c549f6ca, the same applies if
qdio_establish() fails in the DEV_STATE_ONLINE check.
While initializing q->u.out.use_cq would be enough to fix this
particular bug, the more future-proof change is to just zero-alloc the
whole struct.
Fixes: 104ea556ee7f ("qdio: support asynchronous delivery of storage blocks")
Cc: <stable@vger.kernel.org> #v3.2+
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
bmAtributes offset doesn't exist in the UAC3 CS_EP descriptor.
Hence, checking for pitch control as if it was UAC2 doesn't make
any sense. Use the defined UAC3 offsets instead.
Fixes: 9a2fe9b801f5 ("ALSA: usb: initial USB Audio Device Class 3.0 support")
Signed-off-by: Jorge Sanjuan <jorge.sanjuan@codethink.co.uk>
Reviewed-by: Ruslan Bilovol <ruslan.bilovol@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
Since we have the ttm and gem vma managers using a subset
of the file address space for objects, and these start at
0x100000000 they will overflow the new mmap checks.
I've checked all the mmap routines I could see for any
bad behaviour but overall most people use GEM/TTM VMA
managers even the legacy drivers have a hashtable.
Reported-and-Tested-by: Arthur Marsh (amarsh04 on #radeon)
Fixes: be83bbf8068 (mmap: introduce sane default mmap limits)
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
trace_xen_mmu_flush_tlb{_all}
Doing an audit of trace events, I discovered two trace events in the xen
subsystem that use a hack to create zero data size trace events. This is not
what trace events are for. Trace events add memory footprint overhead, and
if all you need to do is see if a function is hit or not, simply make that
function noinline and use function tracer filtering.
Worse yet, the hack used was:
__array(char, x, 0)
Which creates a static string of zero in length. There's assumptions about
such constructs in ftrace that this is a dynamic string that is nul
terminated. This is not the case with these tracepoints and can cause
problems in various parts of ftrace.
Nuke the trace events!
Link: http://lkml.kernel.org/r/20180509144605.5a220327@gandalf.local.home
Cc: stable@vger.kernel.org
Fixes: 95a7d76897c1e ("xen/mmu: Use Xen specific TLB flush instead of the generic one.")
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
Allow to pass the socket address structure with AF_UNSPEC family for
compatibility purposes. selinux_socket_bind() will further check it
for INADDR_ANY and selinux_socket_connect_helper() should return
EINVAL.
For a bad address family return EINVAL instead of AFNOSUPPORT error,
i.e. what is expected from SCTP protocol in such case.
Fixes: d452930fd3b9 ("selinux: Add SCTP support")
Suggested-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
Since sctp_bindx() and sctp_connectx() can have multiple addresses,
sk_family can differ from sa_family. Therefore, selinux_socket_bind()
and selinux_socket_connect_helper(), which process sockaddr structure
(address and port), should use the address family from that structure
too, and not from the socket one.
The initialization of the data for the audit record is moved above,
in selinux_socket_bind(), so that there is no duplicate changes and
code.
Fixes: d452930fd3b9 ("selinux: Add SCTP support")
Suggested-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
Commit d452930fd3b9 ("selinux: Add SCTP support") breaks compatibility
with the old programs that can pass sockaddr_in structure with AF_UNSPEC
and INADDR_ANY to bind(). As a result, bind() returns EAFNOSUPPORT error.
This was found with LTP/asapi_01 test.
Similar to commit 29c486df6a20 ("net: ipv4: relax AF_INET check in
bind()"), which relaxed AF_INET check for compatibility, add AF_UNSPEC
case to AF_INET and make sure that the address is INADDR_ANY.
Fixes: d452930fd3b9 ("selinux: Add SCTP support")
Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
Some AFS servers refuse to accept unencrypted traffic, so can't be accessed
with kAFS. Set the AF_RXRPC security level to encrypt client calls to deal
with this.
Note that incoming service calls are set by the remote client and so aren't
affected by this.
This requires an AF_RXRPC patch to pass the value set by setsockopt to calls
begun by the kernel.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
The handling of CB.CallBack messages sent by the fileserver to the client
is broken in that they are currently being processed after the reply has
been transmitted.
This is not what the fileserver expects, however. It holds up change
visibility until the reply comes so as to maintain cache coherency, and so
expects the client to have to refetch the state on the affected files.
Fix CB.CallBack handling to perform the callback break before sending the
reply.
The fileserver is free to hold up status fetches issued by other threads on
the same client that occur in reponse to the callback until any pending
changes have been committed.
Fixes: d001648ec7cf ("rxrpc: Don't expose skbs to in-kernel users [ver #2]")
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
It's possible for an AFS file server to issue a whole-volume notification
that callbacks on all the vnodes in the file have been broken. This is
done for R/O and backup volumes (which don't have per-file callbacks) and
for things like a volume being taken offline.
Fix callback handling to detect whole-volume notifications, to track it
across operations and to check it during inode validation.
Fixes: c435ee34551e ("afs: Overhaul the callback handling")
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
The code that looks up servers by addresses makes the assumption
that the list of addresses for a server is sorted. It exits the
loop if it finds that the target address is larger than the
current candidate. As the list is not currently sorted, this
can lead to a failure to find a matching server, which can cause
callbacks from that server to be ignored.
Remove the early exit case so that the complete list is searched.
Fixes: d2ddc776a458 ("afs: Overhaul volume and server record caching and fileserver rotation")
Signed-off-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
If the client cache manager operations that need the server record
(CB.Callback, CB.InitCallBackState, and CB.InitCallBackState3) can't find
the server record, they abort the call from the file server with
RX_CALL_DEAD when they should return okay.
Fixes: c35eccb1f614 ("[AFS]: Implement the CB.InitCallBackState3 operation.")
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Add a tracepoint to record callbacks from servers for which we don't have a
record.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Fix the handling of the CB.InitCallBackState3 service call to find the
record of a server that we're using by looking it up by the UUID passed as
the parameter rather than by its address (of which it might have many, and
which may change).
Fixes: c35eccb1f614 ("[AFS]: Implement the CB.InitCallBackState3 operation.")
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
If a volume location record lists multiple file servers for a volume, then
it's possible that due to a misconfiguration or a changing configuration
that one of the file servers doesn't know about it yet and will abort
VNOVOL. Currently, the rotation algorithm will stop with EREMOTEIO.
Fix this by moving on to try the next server if VNOVOL is returned. Once
all the servers have been tried and the record rechecked, the algorithm
will stop with EREMOTEIO or ENOMEDIUM.
Fixes: d2ddc776a458 ("afs: Overhaul volume and server record caching and fileserver rotation")
Reported-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
The OpenAFS server's RXAFS_InlineBulkStatus implementation has a bug
whereby if an error occurs on one of the vnodes being queried, then the
errorCode field is set correctly in the corresponding status, but the
interfaceVersion field is left unset.
Fix kAFS to deal with this by evaluating the AFSFetchStatus blob against
the following cases when called from FS.InlineBulkStatus delivery:
(1) If InterfaceVersion == 0 then:
(a) If errorCode != 0 then it indicates the abort code for the
corresponding vnode.
(b) If errorCode == 0 then the status record is invalid.
(2) If InterfaceVersion == 1 then:
(a) If errorCode != 0 then it indicates the abort code for the
corresponding vnode.
(b) If errorCode == 0 then the status record is valid and can be
parsed.
(3) If InterfaceVersion is anything else then the status record is
invalid.
Fixes: dd9fbcb8e103 ("afs: Rearrange status mapping")
Reported-by: Jeffrey Altman <jaltman@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
The server rotation algorithm just gives up if it fails to probe a
fileserver. Fix this by rotating to the next fileserver instead.
Fixes: d2ddc776a458 ("afs: Overhaul volume and server record caching and fileserver rotation")
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
The refcounting on afs_cb_interest struct objects in
afs_register_server_cb_interest() is wrong as it uses the server list
entry's call back interest pointer without regard for the fact that it
might be replaced at any time and the object thrown away.
Fix this by:
(1) Put a lock on the afs_server_list struct that can be used to
mediate access to the callback interest pointers in the servers array.
(2) Keep a ref on the callback interest that we get from the entry.
(3) Dropping the old reference held by vnode->cb_interest if we replace
the pointer.
Fixes: c435ee34551e ("afs: Overhaul the callback handling")
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
When a server record is destroyed, we want to send a message to the server
telling it that we're giving up all the callbacks it has promised us.
Apply two fixes to this:
(1) Only send the FS.GiveUpAllCallBacks message if we actually got a
callback from that server. We assume this to be the case if we
performed at least one successful FS operation on that server.
(2) Send it to the address last used for that server rather than always
picking the first address in the list (which might be unreachable).
Fixes: d2ddc776a458 ("afs: Overhaul volume and server record caching and fileserver rotation")
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
The parsing of port specifiers in the address list obtained from the DNS
resolution upcall doesn't work as in4_pton() and in6_pton() will fail on
encountering an unexpected delimiter (in this case, the '+' marking the
port number). However, in*_pton() can't be given multiple specifiers.
Fix this by finding the delimiter in advance and not relying on in*_pton()
to find the end of the address for us.
Fixes: 8b2a464ced77 ("afs: Add an address list concept")
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
The afs directory loading code (primarily afs_read_dir()) locks all the
pages that hold a directory's content blob to defend against
getdents/getdents races and getdents/lookup races where the competitors
issue conflicting reads on the same data. As the reads will complete
consecutively, they may retrieve different versions of the data and
one may overwrite the data that the other is busy parsing.
Fix this by not locking the pages at all, but rather by turning the
validation lock into an rwsem and getting an exclusive lock on it whilst
reading the data or validating the attributes and a shared lock whilst
parsing the data. Sharing the attribute validation lock should be fine as
the data fetch will retrieve the attributes also.
The individual page locks aren't needed at all as the only place they're
being used is to serialise data loading.
Without this patch, the:
if (!test_bit(AFS_VNODE_DIR_VALID, &dvnode->flags)) {
...
}
part of afs_read_dir() may be skipped, leaving the pages unlocked when we
hit the success: clause - in which case we try to unlock the not-locked
pages, leading to the following oops:
page:ffffe38b405b4300 count:3 mapcount:0 mapping:ffff98156c83a978 index:0x0
flags: 0xfffe000001004(referenced|private)
raw: 000fffe000001004 ffff98156c83a978 0000000000000000 00000003ffffffff
raw: dead000000000100 dead000000000200 0000000000000001 ffff98156b27c000
page dumped because: VM_BUG_ON_PAGE(!PageLocked(page))
page->mem_cgroup:ffff98156b27c000
------------[ cut here ]------------
kernel BUG at mm/filemap.c:1205!
...
RIP: 0010:unlock_page+0x43/0x50
...
Call Trace:
afs_dir_iterate+0x789/0x8f0 [kafs]
? _cond_resched+0x15/0x30
? kmem_cache_alloc_trace+0x166/0x1d0
? afs_do_lookup+0x69/0x490 [kafs]
? afs_do_lookup+0x101/0x490 [kafs]
? key_default_cmp+0x20/0x20
? request_key+0x3c/0x80
? afs_lookup+0xf1/0x340 [kafs]
? __lookup_slow+0x97/0x150
? lookup_slow+0x35/0x50
? walk_component+0x1bf/0x490
? path_lookupat.isra.52+0x75/0x200
? filename_lookup.part.66+0xa0/0x170
? afs_end_vnode_operation+0x41/0x60 [kafs]
? __check_object_size+0x9c/0x171
? strncpy_from_user+0x4a/0x170
? vfs_statx+0x73/0xe0
? __do_sys_newlstat+0x39/0x70
? __x64_sys_getdents+0xc9/0x140
? __x64_sys_getdents+0x140/0x140
? do_syscall_64+0x5b/0x160
? entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fixes: f3ddee8dc4e2 ("afs: Fix directory handling")
Reported-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/pti updates from Thomas Gleixner:
"A mixed bag of fixes and updates for the ghosts which are hunting us.
The scheduler fixes have been pulled into that branch to avoid
conflicts.
- A set of fixes to address a khread_parkme() race which caused lost
wakeups and loss of state.
- A deadlock fix for stop_machine() solved by moving the wakeups
outside of the stopper_lock held region.
- A set of Spectre V1 array access restrictions. The possible
problematic spots were discuvered by Dan Carpenters new checks in
smatch.
- Removal of an unused file which was forgotten when the rest of that
functionality was removed"
* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/vdso: Remove unused file
perf/x86/cstate: Fix possible Spectre-v1 indexing for pkg_msr
perf/x86/msr: Fix possible Spectre-v1 indexing in the MSR driver
perf/x86: Fix possible Spectre-v1 indexing for x86_pmu::event_map()
perf/x86: Fix possible Spectre-v1 indexing for hw_perf_event cache_*
perf/core: Fix possible Spectre-v1 indexing for ->aux_pages[]
sched/autogroup: Fix possible Spectre-v1 indexing for sched_prio_to_weight[]
sched/core: Fix possible Spectre-v1 indexing for sched_prio_to_weight[]
sched/core: Introduce set_special_state()
kthread, sched/wait: Fix kthread_parkme() completion issue
kthread, sched/wait: Fix kthread_parkme() wait-loop
sched/fair: Fix the update of blocked load when newly idle
stop_machine, sched: Fix migrate_swap() vs. active_balance() deadlock
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fix from Thomas Gleixner:
"Revert the new NUMA aware placement approach which turned out to
create more problems than it solved"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
Revert "sched/numa: Delay retrying placement for automatic NUMA balance after wake_affine()"
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf tooling fixes from Thomas Gleixner:
"Another small set of perf tooling fixes and updates:
- Revert "perf pmu: Fix pmu events parsing rule", as it broke Intel
PT event description parsing (Arnaldo Carvalho de Melo)
- Sync x86's cpufeatures.h and kvm UAPI headers with the kernel
sources, suppressing the ABI drift warnings (Arnaldo Carvalho de
Melo)
- Remove duplicated entry for westmereep-dp in Intel's mapfile.csv
(William Cohen)
- Fix typo in 'perf bench numa' options description (Yisheng Xie)"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
Revert "perf pmu: Fix pmu events parsing rule"
tools headers kvm: Sync ARM UAPI headers with the kernel sources
tools headers kvm: Sync uapi/linux/kvm.h with the kernel sources
tools headers: Sync x86 cpufeatures.h with the kernel sources
perf vendor events intel: Remove duplicated entry for westmereep-dp in mapfile.csv
perf bench numa: Fix typo in options
|
|
Pull dma-mapping fix from Christoph Hellwig:
"Just one little fix from Jean to avoid a harmless but very annoying
warning, especially for the drm code"
* tag 'dma-mapping-4.17-5' of git://git.infradead.org/users/hch/dma-mapping:
swiotlb: silent unwanted warning "buffer is full"
|
|
This adds support for the P950ER, which has the same required fixup as
the P950HR, but has a different PCI ID.
Signed-off-by: Jeremy Soller <jeremy@system76.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
Currently it's not possible to set volume lower than 26% (it just mutes).
Also fixes this warning:
Warning! Unlikely big volume range (=9472), cval->res is probably wrong.
[13] FU [PCM Playback Volume] ch = 2, val = -9473/-1/1
, and volume works fine for full range.
Signed-off-by: Federico Cuello <fedux@fedux.com.ar>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
Power-saving is causing loud plops on the Lenovo C50 All in one, add it
to the blacklist.
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1572975
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
In snd_ctl_elem_add_compat(), the fields of the struct 'data' need to be
copied from the corresponding fields of the struct 'data32' in userspace.
This is achieved by invoking copy_from_user() and get_user() functions. The
problem here is that the 'type' field is copied twice. One is by
copy_from_user() and one is by get_user(). Given that the 'type' field is
not used between the two copies, the second copy is *completely* redundant
and should be removed for better performance and cleanup. Also, these two
copies can cause inconsistent data: as the struct 'data32' resides in
userspace and a malicious userspace process can race to change the 'type'
field between the two copies to cause inconsistent data. Depending on how
the data is used in the future, such an inconsistency may cause potential
security risks.
For above reasons, we should take out the second copy.
Signed-off-by: Wenwen Wang <wang6495@umn.edu>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|