Age | Commit message (Collapse) | Author | Files | Lines |
|
New pt_regs should indicate that there's no syscall, not that there's
syscall #0. While at it wrap macro body in do/while and parenthesize
macro arguments.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
|
|
Syscall may alter pt_regs structure passed to it, resulting in a
mismatch between syscall entry end syscall exit entries in the ftrace.
Temporary restore syscall field of the pt_regs for the duration of
do_syscall_trace_leave.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
|
|
On AMD processors, the detection of an overflowed PMC counter in the NMI
handler relies on the current value of the PMC. So, for example, to check
for overflow on a 48-bit counter, bit 47 is checked to see if it is 1 (not
overflowed) or 0 (overflowed).
When the perf NMI handler executes it does not know in advance which PMC
counters have overflowed. As such, the NMI handler will process all active
PMC counters that have overflowed. NMI latency in newer AMD processors can
result in multiple overflowed PMC counters being processed in one NMI and
then a subsequent NMI, that does not appear to be a back-to-back NMI, not
finding any PMC counters that have overflowed. This may appear to be an
unhandled NMI resulting in either a panic or a series of messages,
depending on how the kernel was configured.
To mitigate this issue, add an AMD handle_irq callback function,
amd_pmu_handle_irq(), that will invoke the common x86_pmu_handle_irq()
function and upon return perform some additional processing that will
indicate if the NMI has been handled or would have been handled had an
earlier NMI not handled the overflowed PMC. Using a per-CPU variable, a
minimum value of the number of active PMCs or 2 will be set whenever a
PMC is active. This is used to indicate the possible number of NMIs that
can still occur. The value of 2 is used for when an NMI does not arrive
at the LAPIC in time to be collapsed into an already pending NMI. Each
time the function is called without having handled an overflowed counter,
the per-CPU value is checked. If the value is non-zero, it is decremented
and the NMI indicates that it handled the NMI. If the value is zero, then
the NMI indicates that it did not handle the NMI.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <stable@vger.kernel.org> # 4.14.x-
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: https://lkml.kernel.org/r/Message-ID:
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
On AMD processors, the detection of an overflowed counter in the NMI
handler relies on the current value of the counter. So, for example, to
check for overflow on a 48 bit counter, bit 47 is checked to see if it
is 1 (not overflowed) or 0 (overflowed).
There is currently a race condition present when disabling and then
updating the PMC. Increased NMI latency in newer AMD processors makes this
race condition more pronounced. If the counter value has overflowed, it is
possible to update the PMC value before the NMI handler can run. The
updated PMC value is not an overflowed value, so when the perf NMI handler
does run, it will not find an overflowed counter. This may appear as an
unknown NMI resulting in either a panic or a series of messages, depending
on how the kernel is configured.
To eliminate this race condition, the PMC value must be checked after
disabling the counter. Add an AMD function, amd_pmu_disable_all(), that
will wait for the NMI handler to reset any active and overflowed counter
after calling x86_pmu_disable_all().
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <stable@vger.kernel.org> # 4.14.x-
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: https://lkml.kernel.org/r/Message-ID:
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Stephane reported that the TFA MSR is not initialized by the kernel,
but the TFA bit could set by firmware or as a leftover from a kexec,
which makes the state inconsistent.
Reported-by: Stephane Eranian <eranian@google.com>
Tested-by: Nelson DSouza <nelson.dsouza@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: tonyj@suse.com
Link: https://lkml.kernel.org/r/20190321123849.GN6521@hirez.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
When an event is programmed with attr.wakeup_events=N (N>0), it means
the caller is interested in getting a user level notification after
N samples have been recorded in the kernel sampling buffer.
With precise events on Intel processors, the kernel uses PEBS.
The kernel tries minimize sampling overhead by verifying
if the event configuration is compatible with multi-entry PEBS mode.
If so, the kernel is notified only when the buffer has reached its threshold.
Other PEBS operates in single-entry mode, the kenrel is notified for each
PEBS sample.
The problem is that the current implementation look at frequency
mode and event sample_type but ignores the wakeup_events field. Thus,
it may not be possible to receive a notification after each precise event.
This patch fixes this problem by disabling multi-entry PEBS if wakeup_events
is non-zero.
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: kan.liang@intel.com
Link: https://lkml.kernel.org/r/20190306195048.189514-1-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The user can control the MBA memory bandwidth in MBps (Mega
Bytes per second) units of the MBA Software Controller (mba_sc)
by using the "mba_MBps" mount option. For details, see
Documentation/x86/resctrl_ui.txt.
However, commit
23bf1b6be9c2 ("kernfs, sysfs, cgroup, intel_rdt: Support fs_context")
changed the mount option name from "mba_MBps" to "mba_mpbs" by mistake.
Change it back from to "mba_MBps" because it is user-visible, and
correct "Opt_mba_mpbs" spelling to "Opt_mba_mbps".
[ bp: massage commit message. ]
Fixes: 23bf1b6be9c2 ("kernfs, sysfs, cgroup, intel_rdt: Support fs_context")
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: dhowells@redhat.com
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: pei.p.jia@intel.com
Cc: Reinette Chatre <reinette.chatre@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/1553896238-22130-1-git-send-email-xiaochen.shen@intel.com
|
|
|
|
Commit 0df977eafc79 ("powerpc/6xx: Don't use SPRN_SPRG2 for storing
stack pointer while in RTAS") changes the code to use a field in
thread struct to store the stack pointer while in RTAS instead of
using SPRN_SPRG2. It therefore converts all places which were
manipulating SPRN_SPRG2 to use that field. During early startup, the
zeroing of SPRN_SPRG2 has been replaced by a zeroing of that field in
thread struct. But at least in start_here, that's done wrongly because
it used the physical address of the fields while MMU is on at that
time.
So the virtual address of the field should be used instead, but in
the meantime, thread struct has already been zeroed and initialised
so we can just drop this initialisation.
Reported-by: Larry Finger <Larry.Finger@lwfinger.net>
Fixes: 0df977eafc79 ("powerpc/6xx: Don't use SPRN_SPRG2 for storing stack pointer while in RTAS")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Tested-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Pull KVM fixes from Paolo Bonzini:
"A collection of x86 and ARM bugfixes, and some improvements to
documentation.
On top of this, a cleanup of kvm_para.h headers, which were exported
by some architectures even though they not support KVM at all. This is
responsible for all the Kbuild changes in the diffstat"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (28 commits)
Documentation: kvm: clarify KVM_SET_USER_MEMORY_REGION
KVM: doc: Document the life cycle of a VM and its resources
KVM: selftests: complete IO before migrating guest state
KVM: selftests: disable stack protector for all KVM tests
KVM: selftests: explicitly disable PIE for tests
KVM: selftests: assert on exit reason in CR4/cpuid sync test
KVM: x86: update %rip after emulating IO
x86/kvm/hyper-v: avoid spurious pending stimer on vCPU init
kvm/x86: Move MSR_IA32_ARCH_CAPABILITIES to array emulated_msrs
KVM: x86: Emulate MSR_IA32_ARCH_CAPABILITIES on AMD hosts
kvm: don't redefine flags as something else
kvm: mmu: Used range based flushing in slot_handle_level_range
KVM: export <linux/kvm_para.h> and <asm/kvm_para.h> iif KVM is supported
KVM: x86: remove check on nr_mmu_pages in kvm_arch_commit_memory_region()
kvm: nVMX: Add a vmentry check for HOST_SYSENTER_ESP and HOST_SYSENTER_EIP fields
KVM: SVM: Workaround errata#1096 (insn_len maybe zero on SMAP violation)
KVM: Reject device ioctls from processes other than the VM's creator
KVM: doc: Fix incorrect word ordering regarding supported use of APIs
KVM: x86: fix handling of role.cr4_pae and rename it to 'gpte_size'
KVM: nVMX: Do not inherit quadrant and invalid for the root shadow EPT
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
"A pile of x86 updates:
- Prevent exceeding he valid physical address space in the /dev/mem
limit checks.
- Move all header content inside the header guard to prevent compile
failures.
- Fix the bogus __percpu annotation in this_cpu_has() which makes
sparse very noisy.
- Disable switch jump tables completely when retpolines are enabled.
- Prevent leaking the trampoline address"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/realmode: Make set_real_mode_mem() static inline
x86/cpufeature: Fix __percpu annotation in this_cpu_has()
x86/mm: Don't exceed the valid physical address space
x86/retpolines: Disable switch jump tables when retpolines are enabled
x86/realmode: Don't leak the trampoline kernel address
x86/boot: Fix incorrect ifdeffery scope
x86/resctrl: Remove unused variable
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull CPU hotplug fixes from Thomas Gleixner:
"Two SMT/hotplug related fixes:
- Prevent crash when HOTPLUG_CPU is disabled and the CPU bringup
aborts. This is triggered with the 'nosmt' command line option, but
can happen by any abort condition. As the real unplug code is not
compiled in, prevent the fail by keeping the CPU in zombie state.
- Enforce HOTPLUG_CPU for SMP on x86 to avoid the above situation
completely. With 'nosmt' being a popular option it's required to
unplug the half brought up sibling CPUs (due to the MCE wreckage)
completely"
* 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/smp: Enforce CONFIG_HOTPLUG_CPU when SMP=y
cpu/hotplug: Prevent crash when CPU bringup fails on CONFIG_HOTPLUG_CPU=n
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"Three non-regression fixes.
- Our optimised memcmp could read past the end of one of the buffers
and potentially trigger a page fault leading to an oops.
- Some of our code to read energy management data on PowerVM had an
endian bug leading to bogus results.
- When reporting a machine check exception we incorrectly reported
TLB multihits as D-Cache multhits due to a missing entry in the
array of causes.
Thanks to: Chandan Rajendra, Gautham R. Shenoy, Mahesh Salgaonkar,
Segher Boessenkool, Vaidyanathan Srinivasan"
* tag 'powerpc-5.1-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/pseries/mce: Fix misleading print for TLB mutlihit
powerpc/pseries/energy: Use OF accessor functions to read ibm,drc-indexes
powerpc/64: Fix memcmp reading past the end of src/dest
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fix from Catalin Marinas:
"Use memblock_alloc() instead of memblock_alloc_low() in
request_standard_resources(), the latter being limited to the low 4G
memory range on arm64"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: replace memblock_alloc_low with memblock_alloc
|
|
Remove the unused @size argument and move it into a header file, so it
can be inlined.
[ bp: Massage. ]
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Mukesh Ojha <mojha@codeaurora.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-efi <linux-efi@vger.kernel.org>
Cc: platform-driver-x86@vger.kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190328114233.27835-1-mcroce@redhat.com
|
|
IS_ENABLED should generally use CONFIG_ prefaced symbols and
it doesn't appear as if there is a CMODEL_MEDLOW define.
Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
|
|
The FIXMAP area overlaps with VMALLOC area in Linux-5.1-rc1 hence we get
below warning in Linux RISC-V 32bit kernel. This warning does not show-up
in Linux RISC-V 64bit kernel due to large VMALLOC area.
WARNING: CPU: 0 PID: 22 at mm/vmalloc.c:150 vmap_page_range_noflush+0x134/0x15c
Modules linked in:
CPU: 0 PID: 22 Comm: kworker/0:1 Not tainted 5.1.0-rc1-00005-gebc2f658040e #1
Workqueue: events pcpu_balance_workfn
Call Trace:
[<c002b950>] walk_stackframe+0x0/0xa0
[<c002baac>] show_stack+0x28/0x32
[<c0587354>] dump_stack+0x62/0x7e
[<c002fdee>] __warn+0x98/0xce
[<c002fe52>] warn_slowpath_null+0x2e/0x3c
[<c00e71ce>] vmap_page_range_noflush+0x134/0x15c
[<c00e7886>] map_kernel_range_noflush+0xc/0x14
[<c00d54b8>] pcpu_populate_chunk+0x19e/0x236
[<c00d610e>] pcpu_balance_workfn+0x448/0x464
[<c00408d6>] process_one_work+0x16c/0x2ea
[<c0040b46>] worker_thread+0xf2/0x3b2
[<c004519a>] kthread+0xce/0xdc
[<c002a974>] ret_from_exception+0x0/0xc
This patch fixes above warning by placing FIXMAP area below VMALLOC area.
Fixes: f2c17aabc917 ("RISC-V: Implement compile-time fixed mappings")
Signed-off-by: Anup Patel <anup.patel@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
|
|
On pseries, TLB multihit are reported as D-Cache Multihit. This is because
the wrongly populated mc_err_types[] array. Per PAPR, TLB error type is 0x04
and mc_err_types[4] points to "D-Cache" instead of "TLB" string. Fixup the
mc_err_types[] array.
Machine check error type per PAPR:
0x00 = Uncorrectable Memory Error (UE)
0x01 = SLB error
0x02 = ERAT Error
0x04 = TLB error
0x05 = D-Cache error
0x07 = I-Cache error
Fixes: 8f0b80561f21 ("powerpc/pseries: Display machine check error details.")
Cc: stable@vger.kernel.org # v4.20+
Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
smatch complaint:
arch/mips/sgi-ip27/ip27-irq.c:123 shutdown_bridge_irq()
warn: variable dereferenced before check 'hd' (see line 121)
Fix it by removing local variable and use hd->pin directly.
Fixes: 69a07a41d908 ("MIPS: SGI-IP27: rework HUB interrupts")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Reviewed-by: Mukesh Ojha <mojha@codeaurora.org>
Signed-off-by: Paul Burton <paul.burton@mips.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: James Hogan <jhogan@kernel.org>
Cc: linux-mips@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
|
|
KGDB_call_nmi_hook is called by other cpu through smp call.
MIPS smp call is processed in ipi irq handler and regs is saved in
handle_int.
So kgdb_call_nmi_hook get regs by get_irq_regs and regs will be passed
to kgdb_cpu_enter.
Signed-off-by: Chong Qiao <qiaochong@loongson.cn>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Acked-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Paul Burton <paul.burton@mips.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: James Hogan <jhogan@kernel.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: linux-mips@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: QiaoChong <qiaochong@loongson.cn>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master
KVM/ARM fixes for 5.1
- Fix THP handling in the presence of pre-existing PTEs
- Honor request for PTE mappings even when THPs are available
- GICv4 performance improvement
- Take the srcu lock when writing to guest-controlled ITS data structures
- Reset the virtual PMU in preemptible context
- Various cleanups
|
|
Most (all?) x86 platforms provide a port IO based reset mechanism, e.g.
OUT 92h or CF9h. Userspace may emulate said mechanism, i.e. reset a
vCPU in response to KVM_EXIT_IO, without explicitly announcing to KVM
that it is doing a reset, e.g. Qemu jams vCPU state and resumes running.
To avoid corruping %rip after such a reset, commit 0967b7bf1c22 ("KVM:
Skip pio instruction when it is emulated, not executed") changed the
behavior of PIO handlers, i.e. today's "fast" PIO handling to skip the
instruction prior to exiting to userspace. Full emulation doesn't need
such tricks becase re-emulating the instruction will naturally handle
%rip being changed to point at the reset vector.
Updating %rip prior to executing to userspace has several drawbacks:
- Userspace sees the wrong %rip on the exit, e.g. if PIO emulation
fails it will likely yell about the wrong address.
- Single step exits to userspace for are effectively dropped as
KVM_EXIT_DEBUG is overwritten with KVM_EXIT_IO.
- Behavior of PIO emulation is different depending on whether it
goes down the fast path or the slow path.
Rather than skip the PIO instruction before exiting to userspace,
snapshot the linear %rip and cancel PIO completion if the current
value does not match the snapshot. For a 64-bit vCPU, i.e. the most
common scenario, the snapshot and comparison has negligible overhead
as VMCS.GUEST_RIP will be cached regardless, i.e. there is no extra
VMREAD in this case.
All other alternatives to snapshotting the linear %rip that don't
rely on an explicit reset announcenment suffer from one corner case
or another. For example, canceling PIO completion on any write to
%rip fails if userspace does a save/restore of %rip, and attempting to
avoid that issue by canceling PIO only if %rip changed then fails if PIO
collides with the reset %rip. Attempting to zero in on the exact reset
vector won't work for APs, which means adding more hooks such as the
vCPU's MP_STATE, and so on and so forth.
Checking for a linear %rip match technically suffers from corner cases,
e.g. userspace could theoretically rewrite the underlying code page and
expect a different instruction to execute, or the guest hardcodes a PIO
reset at 0xfffffff0, but those are far, far outside of what can be
considered normal operation.
Fixes: 432baf60eee3 ("KVM: VMX: use kvm_fast_pio_in for handling IN I/O")
Cc: <stable@vger.kernel.org>
Reported-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
When userspace initializes guest vCPUs it may want to zero all supported
MSRs including Hyper-V related ones including HV_X64_MSR_STIMERn_CONFIG/
HV_X64_MSR_STIMERn_COUNT. With commit f3b138c5d89a ("kvm/x86: Update SynIC
timers on guest entry only") we began doing stimer_mark_pending()
unconditionally on every config change.
The issue I'm observing manifests itself as following:
- Qemu writes 0 to STIMERn_{CONFIG,COUNT} MSRs and marks all stimers as
pending in stimer_pending_bitmap, arms KVM_REQ_HV_STIMER;
- kvm_hv_has_stimer_pending() starts returning true;
- kvm_vcpu_has_events() starts returning true;
- kvm_arch_vcpu_runnable() starts returning true;
- when kvm_arch_vcpu_ioctl_run() gets into
(vcpu->arch.mp_state == KVM_MP_STATE_UNINITIALIZED) case:
- kvm_vcpu_block() gets in 'kvm_vcpu_check_block(vcpu) < 0' and returns
immediately, avoiding normal wait path;
- -EAGAIN is returned from kvm_arch_vcpu_ioctl_run() immediately forcing
userspace to retry.
So instead of normal wait path we get a busy loop on all secondary vCPUs
before they get INIT signal. This seems to be undesirable, especially given
that this happens even when Hyper-V extensions are not used.
Generally, it seems to be pointless to mark an stimer as pending in
stimer_pending_bitmap and arm KVM_REQ_HV_STIMER as the only thing
kvm_hv_process_stimers() will do is clear the corresponding bit. We may
just not mark disabled timers as pending instead.
Fixes: f3b138c5d89a ("kvm/x86: Update SynIC timers on guest entry only")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Since MSR_IA32_ARCH_CAPABILITIES is emualted unconditionally even if
host doesn't suppot it. We should move it to array emulated_msrs from
arry msrs_to_save, to report to userspace that guest support this msr.
Signed-off-by: Xiaoyao Li <xiaoyao.li@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The CPUID flag ARCH_CAPABILITIES is unconditioinally exposed to host
userspace for all x86 hosts, i.e. KVM advertises ARCH_CAPABILITIES
regardless of hardware support under the pretense that KVM fully
emulates MSR_IA32_ARCH_CAPABILITIES. Unfortunately, only VMX hosts
handle accesses to MSR_IA32_ARCH_CAPABILITIES (despite KVM_GET_MSRS
also reporting MSR_IA32_ARCH_CAPABILITIES for all hosts).
Move the MSR_IA32_ARCH_CAPABILITIES handling to common x86 code so
that it's emulated on AMD hosts.
Fixes: 1eaafe91a0df4 ("kvm: x86: IA32_ARCH_CAPABILITIES is always supported")
Cc: stable@vger.kernel.org
Reported-by: Xiaoyao Li <xiaoyao.li@linux.intel.com>
Cc: Jim Mattson <jmattson@google.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Replace kvm_flush_remote_tlbs with kvm_flush_remote_tlbs_with_address
in slot_handle_level_range. When range based flushes are not enabled
kvm_flush_remote_tlbs_with_address falls back to kvm_flush_remote_tlbs.
This changes the behavior of many functions that indirectly use
slot_handle_level_range, iff the range based flushes are enabled. The
only potential problem I see with this is that kvm->tlbs_dirty will be
cleared less often, however the only caller of slot_handle_level_range that
checks tlbs_dirty is kvm_mmu_notifier_invalidate_range_start which
checks it and does a kvm_flush_remote_tlbs after calling
kvm_unmap_hva_range anyway.
Tested: Ran all kvm-unit-tests on a Intel Haswell machine with and
without this patch. The patch introduced no new failures.
Signed-off-by: Ben Gardon <bgardon@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
I do not see any consistency about headers_install of <linux/kvm_para.h>
and <asm/kvm_para.h>.
According to my analysis of Linux 5.1-rc1, there are 3 groups:
[1] Both <linux/kvm_para.h> and <asm/kvm_para.h> are exported
alpha, arm, hexagon, mips, powerpc, s390, sparc, x86
[2] <asm/kvm_para.h> is exported, but <linux/kvm_para.h> is not
arc, arm64, c6x, h8300, ia64, m68k, microblaze, nios2, openrisc,
parisc, sh, unicore32, xtensa
[3] Neither <linux/kvm_para.h> nor <asm/kvm_para.h> is exported
csky, nds32, riscv
This does not match to the actual KVM support. At least, [2] is
half-baked.
Nor do arch maintainers look like they care about this. For example,
commit 0add53713b1c ("microblaze: Add missing kvm_para.h to Kbuild")
exported <asm/kvm_para.h> to user-space in order to fix an in-kernel
build error.
We have two ways to make this consistent:
[A] export both <linux/kvm_para.h> and <asm/kvm_para.h> for all
architectures, irrespective of the KVM support
[B] Match the header export of <linux/kvm_para.h> and <asm/kvm_para.h>
to the KVM support
My first attempt was [A] because the code looks cleaner, but Paolo
suggested [B].
So, this commit goes with [B].
For most architectures, <asm/kvm_para.h> was moved to the kernel-space.
I changed include/uapi/linux/Kbuild so that it checks generated
asm/kvm_para.h as well as check-in ones.
After this commit, there will be two groups:
[1] Both <linux/kvm_para.h> and <asm/kvm_para.h> are exported
arm, arm64, mips, powerpc, s390, x86
[2] Neither <linux/kvm_para.h> nor <asm/kvm_para.h> is exported
alpha, arc, c6x, csky, h8300, hexagon, ia64, m68k, microblaze,
nds32, nios2, openrisc, parisc, riscv, sh, sparc, unicore32, xtensa
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
* nr_mmu_pages would be non-zero only if kvm->arch.n_requested_mmu_pages is
non-zero.
* nr_mmu_pages is always non-zero, since kvm_mmu_calculate_mmu_pages()
never return zero.
Based on these two reasons, we can merge the two *if* clause and use the
return value from kvm_mmu_calculate_mmu_pages() directly. This simplify
the code and also eliminate the possibility for reader to believe
nr_mmu_pages would be zero.
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
fields
According to section "Checks on VMX Controls" in Intel SDM vol 3C, the
following check is performed on vmentry of L2 guests:
On processors that support Intel 64 architecture, the IA32_SYSENTER_ESP
field and the IA32_SYSENTER_EIP field must each contain a canonical
address.
Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Reviewed-by: Mihai Carabas <mihai.carabas@oracle.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Errata#1096:
On a nested data page fault when CR.SMAP=1 and the guest data read
generates a SMAP violation, GuestInstrBytes field of the VMCB on a
VMEXIT will incorrectly return 0h instead the correct guest
instruction bytes .
Recommend Workaround:
To determine what instruction the guest was executing the hypervisor
will have to decode the instruction at the instruction pointer.
The recommended workaround can not be implemented for the SEV
guest because guest memory is encrypted with the guest specific key,
and instruction decoder will not be able to decode the instruction
bytes. If we hit this errata in the SEV guest then log the message
and request a guest shutdown.
Reported-by: Venkatesh Srinivas <venkateshs@google.com>
Cc: Jim Mattson <jmattson@google.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The cr4_pae flag is a bit of a misnomer, its purpose is really to track
whether the guest PTE that is being shadowed is a 4-byte entry or an
8-byte entry. Prior to supporting nested EPT, the size of the gpte was
reflected purely by CR4.PAE. KVM fudged things a bit for direct sptes,
but it was mostly harmless since the size of the gpte never mattered.
Now that a spte may be tracking an indirect EPT entry, relying on
CR4.PAE is wrong and ill-named.
For direct shadow pages, force the gpte_size to '1' as they are always
8-byte entries; EPT entries can only be 8-bytes and KVM always uses
8-byte entries for NPT and its identity map (when running with EPT but
not unrestricted guest).
Likewise, nested EPT entries are always 8-bytes. Nested EPT presents a
unique scenario as the size of the entries are not dictated by CR4.PAE,
but neither is the shadow page a direct map. To handle this scenario,
set cr0_wp=1 and smap_andnot_wp=1, an otherwise impossible combination,
to denote a nested EPT shadow page. Use the information to avoid
incorrectly zapping an unsync'd indirect page in __kvm_sync_page().
Providing a consistent and accurate gpte_size fixes a bug reported by
Vitaly where fast_cr3_switch() always fails when switching from L2 to
L1 as kvm_mmu_get_page() would force role.cr4_pae=0 for direct pages,
whereas kvm_calc_mmu_role_common() would set it according to CR4.PAE.
Fixes: 7dcd575520082 ("x86/kvm/mmu: check if tdp/shadow MMU reconfiguration is needed")
Reported-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Tested-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Explicitly zero out quadrant and invalid instead of inheriting them from
the root_mmu. Functionally, this patch is a nop as we (should) never
set quadrant for a direct mapped (EPT) root_mmu and nested EPT is only
allowed if EPT is used for L1, and the root_mmu will never be invalid at
this point.
Explicitly setting flags sets the stage for repurposing the legacy
paging bits in role, e.g. nxe, cr0_wp, and sm{a,e}p_andnot_wp, at which
point 'smm' would be the only flag to be inherited from root_mmu.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
&cpu_info.x86_capability is __percpu, and the second argument of
x86_this_cpu_test_bit() is expected to be __percpu. Don't cast the
__percpu away and then implicitly add it again. This gets rid of 106
lines of sparse warnings with the kernel config I'm using.
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Nadav Amit <namit@vmware.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190328154948.152273-1-jannh@google.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Martin Schwidefsky:
"Improvements and bug fixes for 5.1-rc2:
- Fix early free of the channel program in vfio
- On AP device removal make sure that all messages are flushed with
the driver still attached that queued the message
- Limit brk randomization to 32MB to reduce the chance that the heap
of ld.so is placed after the main stack
- Add a rolling average for the steal time of a CPU, this will be
needed for KVM to decide when to do busy waiting
- Fix a warning in the CPU-MF code
- Add a notification handler for AP configuration change to react
faster to new AP devices"
* tag 's390-5.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/cpumf: Fix warning from check_processor_id
zcrypt: handle AP Info notification from CHSC SEI command
vfio: ccw: only free cp on final interrupt
s390/vtime: steal time exponential moving average
s390/zcrypt: revisit ap device remove procedure
s390: limit brk randomization to 32MB
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC fixes from Arnd Bergmann:
"A couple of minor fixes only for now
- fix for incorrect DMA channels on Renesas R-Car
- Broadcom bcm2835 error handling fixes
- Kconfig dependency fixes for bcm2835 and davinci
- CPU idle wakeup fix for i.MX6
- MMC regression on Tegra186
- fix incorrect phy settings on one imx board"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
arm64: tegra: Disable CQE Support for SDMMC4 on Tegra186
ARM: dts: nomadik: Fix polarity of SPI CS
ARM: davinci: fix build failure with allnoconfig
ARM: imx_v4_v5_defconfig: enable PWM driver
ARM: imx_v6_v7_defconfig: continue compiling the pwm driver
ARM: dts: imx6dl-yapp4: Use correct pseudo PHY address for the switch
ARM: dts: imx6qdl: Fix typo in imx6qdl-icore-rqs.dtsi
ARM: dts: imx6ull: Use the correct style for SPDX License Identifier
ARM: dts: pfla02: increase phy reset duration
ARM: imx6q: cpuidle: fix bug that CPU might not wake up at expected time
ARM: imx51: fix a leaked reference by adding missing of_node_put
ARM: dts: imx6dl-yapp4: Use rgmii-id phy mode on the cpu port
arm64: bcm2835: Add missing dependency on MFD_CORE.
ARM: dts: bcm283x: Fix hdmi hpd gpio pull
soc: bcm: bcm2835-pm: Fix error paths of initialization.
soc: bcm: bcm2835-pm: Fix PM_IMAGE_PERI power domain support.
arm64: dts: renesas: r8a774c0: Fix SCIF5 DMA channels
arm64: dts: renesas: r8a77990: Fix SCIF5 DMA channels
|
|
valid_phys_addr_range() is used to sanity check the physical address range
of an operation, e.g., access to /dev/mem. It uses __pa(high_memory)
internally.
If memory is populated at the end of the physical address space, then
__pa(high_memory) is outside of the physical address space because:
high_memory = (void *)__va(max_pfn * PAGE_SIZE - 1) + 1;
For the comparison in valid_phys_addr_range() this is not an issue, but if
CONFIG_DEBUG_VIRTUAL is enabled, __pa() maps to __phys_addr(), which
verifies that the resulting physical address is within the valid physical
address space of the CPU. So in the case that memory is populated at the
end of the physical address space, this is not true and triggers a
VIRTUAL_BUG_ON().
Use __pa(high_memory - 1) to prevent the conversion from going beyond
the end of valid physical addresses.
Fixes: be62a3204406 ("x86/mm: Limit mmap() of /dev/mem to valid physical addresses")
Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Craig Bergstrom <craigb@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hans Verkuil <hans.verkuil@cisco.com>
Cc: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sander Eikelenboom <linux@eikelenboom.it>
Cc: Sean Young <sean@mess.org>
Link: https://lkml.kernel.org/r/20190326001817.15413-2-rcampbell@nvidia.com
|
|
Commit ce02ef06fcf7 ("x86, retpolines: Raise limit for generating indirect
calls from switch-case") raised the limit under retpolines to 20 switch
cases where gcc would only then start to emit jump tables, and therefore
effectively disabling the emission of slow indirect calls in this area.
After this has been brought to attention to gcc folks [0], Martin Liska
has then fixed gcc to align with clang by avoiding to generate switch jump
tables entirely under retpolines. This is taking effect in gcc starting
from stable version 8.4.0. Given kernel supports compilation with older
versions of gcc where the fix is not being available or backported anymore,
we need to keep the extra KBUILD_CFLAGS around for some time and generally
set the -fno-jump-tables to align with what more recent gcc is doing
automatically today.
More than 20 switch cases are not expected to be fast-path critical, but
it would still be good to align with gcc behavior for versions < 8.4.0 in
order to have consistency across supported gcc versions. vmlinux size is
slightly growing by 0.27% for older gcc. This flag is only set to work
around affected gcc, no change for clang.
[0] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86952
Suggested-by: Martin Liska <mliska@suse.cz>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Björn Töpel<bjorn.topel@intel.com>
Cc: Magnus Karlsson <magnus.karlsson@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: H.J. Lu <hjl.tools@gmail.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: David S. Miller <davem@davemloft.net>
Link: https://lkml.kernel.org/r/20190325135620.14882-1-daniel@iogearbox.net
|
|
The SMT disable 'nosmt' command line argument is not working properly when
CONFIG_HOTPLUG_CPU is disabled. The teardown of the sibling CPUs which are
required to be brought up due to the MCE issues, cannot work. The CPUs are
then kept in a half dead state.
As the 'nosmt' functionality has become popular due to the speculative
hardware vulnerabilities, the half torn down state is not a proper solution
to the problem.
Enforce CONFIG_HOTPLUG_CPU=y when SMP is enabled so the full operation is
possible.
Reported-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Konrad Wilk <konrad.wilk@oracle.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Mukesh Ojha <mojha@codeaurora.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Rik van Riel <riel@surriel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Micheal Kelley <michael.h.kelley@microsoft.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20190326163811.598166056@linutronix.de
|
|
The function argument for the ISC_D0 on PC9 was incorrect. According to
the documentation it should be 'C' aka 3.
Signed-off-by: David Engraf <david.engraf@sysgo.com>
Reviewed-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: Ludovic Desroches <ludovic.desroches@microchip.com>
Fixes: 7f16cb676c00 ("ARM: at91/dt: add sama5d2 pinmux")
Cc: <stable@vger.kernel.org> # v4.4+
|
|
Function __hw_perf_event_init() used a CPU variable without
ensuring CPU preemption has been disabled. This caused the
following warning in the kernel log:
[ 7.277085] BUG: using smp_processor_id() in preemptible
[00000000] code: cf-csdiag/1892
[ 7.277111] caller is cf_diag_event_init+0x13a/0x338
[ 7.277122] CPU: 10 PID: 1892 Comm: cf-csdiag Not tainted
5.0.0-20190318.rc0.git0.9e1a11e0f602.300.fc29.s390x+debug #1
[ 7.277131] Hardware name: IBM 2964 NC9 712 (LPAR)
[ 7.277139] Call Trace:
[ 7.277150] ([<000000000011385a>] show_stack+0x82/0xd0)
[ 7.277161] [<0000000000b7a71a>] dump_stack+0x92/0xd0
[ 7.277174] [<00000000007b7e9c>] check_preemption_disabled+0xe4/0x100
[ 7.277183] [<00000000001228aa>] cf_diag_event_init+0x13a/0x338
[ 7.277195] [<00000000002cf3aa>] perf_try_init_event+0x72/0xf0
[ 7.277204] [<00000000002d0bba>] perf_event_alloc+0x6fa/0xce0
[ 7.277214] [<00000000002dc4a8>] __s390x_sys_perf_event_open+0x398/0xd50
[ 7.277224] [<0000000000b9e8f0>] system_call+0xdc/0x2d8
[ 7.277233] 2 locks held by cf-csdiag/1892:
[ 7.277241] #0: 00000000976f5510 (&sig->cred_guard_mutex){+.+.},
at: __s390x_sys_perf_event_open+0xd2e/0xd50
[ 7.277257] #1: 00000000363b11bd (&pmus_srcu){....},
at: perf_event_alloc+0x52e/0xce0
The variable is now accessed in proper context. Use
get_cpu_var()/put_cpu_var() pair to disable
preemption during access.
As the hardware authorization settings apply to all CPUs, it
does not matter which CPU is used to check the authorization setting.
Remove the event->count assignment. It is not needed as function
perf_event_alloc() allocates memory for the event with kzalloc() and
thus count is already set to zero.
Fixes: fe5908bccc56 ("s390/cpum_cf_diag: Add support for s390 counter facility diagnostic trace")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
Pull networking fixes from David Miller:
"Fixes here and there, a couple new device IDs, as usual:
1) Fix BQL race in dpaa2-eth driver, from Ioana Ciornei.
2) Fix 64-bit division in iwlwifi, from Arnd Bergmann.
3) Fix documentation for some eBPF helpers, from Quentin Monnet.
4) Some UAPI bpf header sync with tools, also from Quentin Monnet.
5) Set descriptor ownership bit at the right time for jumbo frames in
stmmac driver, from Aaro Koskinen.
6) Set IFF_UP properly in tun driver, from Eric Dumazet.
7) Fix load/store doubleword instruction generation in powerpc eBPF
JIT, from Naveen N. Rao.
8) nla_nest_start() return value checks all over, from Kangjie Lu.
9) Fix asoc_id handling in SCTP after the SCTP_*_ASSOC changes this
merge window. From Marcelo Ricardo Leitner and Xin Long.
10) Fix memory corruption with large MTUs in stmmac, from Aaro
Koskinen.
11) Do not use ipv4 header for ipv6 flows in TCP and DCCP, from Eric
Dumazet.
12) Fix topology subscription cancellation in tipc, from Erik Hugne.
13) Memory leak in genetlink error path, from Yue Haibing.
14) Valid control actions properly in packet scheduler, from Davide
Caratti.
15) Even if we get EEXIST, we still need to rehash if a shrink was
delayed. From Herbert Xu.
16) Fix interrupt mask handling in interrupt handler of r8169, from
Heiner Kallweit.
17) Fix leak in ehea driver, from Wen Yang"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (168 commits)
dpaa2-eth: fix race condition with bql frame accounting
chelsio: use BUG() instead of BUG_ON(1)
net: devlink: skip info_get op call if it is not defined in dumpit
net: phy: bcm54xx: Encode link speed and activity into LEDs
tipc: change to check tipc_own_id to return in tipc_net_stop
net: usb: aqc111: Extend HWID table by QNAP device
net: sched: Kconfig: update reference link for PIE
net: dsa: qca8k: extend slave-bus implementations
net: dsa: qca8k: remove leftover phy accessors
dt-bindings: net: dsa: qca8k: support internal mdio-bus
dt-bindings: net: dsa: qca8k: fix example
net: phy: don't clear BMCR in genphy_soft_reset
bpf, libbpf: clarify bump in libbpf version info
bpf, libbpf: fix version info and add it to shared object
rxrpc: avoid clang -Wuninitialized warning
tipc: tipc clang warning
net: sched: fix cleanup NULL pointer exception in act_mirr
r8169: fix cable re-plugging issue
net: ethernet: ti: fix possible object reference leak
net: ibm: fix possible object reference leak
...
|
|
If we use "crashkernel=Y[@X]" and the start address is above 4G,
the arm64 kdump capture kernel may call memblock_alloc_low() failure
in request_standard_resources(). Replacing memblock_alloc_low() with
memblock_alloc().
[ 0.000000] MEMBLOCK configuration:
[ 0.000000] memory size = 0x0000000040650000 reserved size = 0x0000000004db7f39
[ 0.000000] memory.cnt = 0x6
[ 0.000000] memory[0x0] [0x00000000395f0000-0x000000003968ffff], 0x00000000000a0000 bytes on node 0 flags: 0x4
[ 0.000000] memory[0x1] [0x0000000039730000-0x000000003973ffff], 0x0000000000010000 bytes on node 0 flags: 0x4
[ 0.000000] memory[0x2] [0x0000000039780000-0x000000003986ffff], 0x00000000000f0000 bytes on node 0 flags: 0x4
[ 0.000000] memory[0x3] [0x0000000039890000-0x0000000039d0ffff], 0x0000000000480000 bytes on node 0 flags: 0x4
[ 0.000000] memory[0x4] [0x000000003ed00000-0x000000003ed2ffff], 0x0000000000030000 bytes on node 0 flags: 0x4
[ 0.000000] memory[0x5] [0x0000002040000000-0x000000207fffffff], 0x0000000040000000 bytes on node 0 flags: 0x0
[ 0.000000] reserved.cnt = 0x7
[ 0.000000] reserved[0x0] [0x0000002040080000-0x0000002041c4dfff], 0x0000000001bce000 bytes flags: 0x0
[ 0.000000] reserved[0x1] [0x0000002041c53000-0x0000002042c203f8], 0x0000000000fcd3f9 bytes flags: 0x0
[ 0.000000] reserved[0x2] [0x000000207da00000-0x000000207dbfffff], 0x0000000000200000 bytes flags: 0x0
[ 0.000000] reserved[0x3] [0x000000207ddef000-0x000000207fbfffff], 0x0000000001e11000 bytes flags: 0x0
[ 0.000000] reserved[0x4] [0x000000207fdf2b00-0x000000207fdfc03f], 0x0000000000009540 bytes flags: 0x0
[ 0.000000] reserved[0x5] [0x000000207fdfd000-0x000000207ffff3ff], 0x0000000000202400 bytes flags: 0x0
[ 0.000000] reserved[0x6] [0x000000207ffffe00-0x000000207fffffff], 0x0000000000000200 bytes flags: 0x0
[ 0.000000] Kernel panic - not syncing: request_standard_resources: Failed to allocate 384 bytes
[ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 5.1.0-next-20190321+ #4
[ 0.000000] Call trace:
[ 0.000000] dump_backtrace+0x0/0x188
[ 0.000000] show_stack+0x24/0x30
[ 0.000000] dump_stack+0xa8/0xcc
[ 0.000000] panic+0x14c/0x31c
[ 0.000000] setup_arch+0x2b0/0x5e0
[ 0.000000] start_kernel+0x90/0x52c
[ 0.000000] ---[ end Kernel panic - not syncing: request_standard_resources: Failed to allocate 384 bytes ]---
Link: https://www.spinics.net/lists/arm-kernel/msg715293.html
Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Since commit
ad67b74d2469 ("printk: hash addresses printed with %p")
at boot "____ptrval____" is printed instead of the trampoline addresses:
Base memory trampoline at [(____ptrval____)] 99000 size 24576
Remove the print as we don't want to leak kernel addresses and this
statement is not needed anymore.
Fixes: ad67b74d2469d9b8 ("printk: hash addresses printed with %p")
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190326203046.20787-1-mcroce@redhat.com
|
|
The declarations related to immovable memory handling are out of the
BOOT_COMPRESSED_MISC_H #ifdef scope, wrap them inside.
Signed-off-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Chao Fan <fanc.fnst@cn.fujitsu.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190304055546.18566-1-bhe@redhat.com
|
|
The Linux RISC-V 32bit kernel is broken after we moved setup_vm() from
kernel/setup.c to mm/init.c because Linux RISC-V 32bit kernel by default
uses cmodel=medlow which results in a non-position-independent setup_vm().
This patch fixes Linux RISC-V 32bit kernel booting by:
1. Forcing cmodel=medany for mm/init.c
2. Moving remaing MM-related stuff va_pa_offset, pfn_base and
empty_zero_page from kernel/setup.c to mm/init.c
Further, the setup_vm() cannot handle GCC instrumentation for FTRACE so
we disable it for mm/init.c by not using "-pg" compiler flag.
Fixes: 6f1e9e946f0b ("RISC-V: Move setup_vm() to mm/init.c")
Suggested-by: Christoph Hellwig <hch@lst.de>
Suggested-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Anup Patel <anup.patel@wdc.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
|
|
A memory save operation to 8-byte variable in RV32 is divided into
two sw instructions in the put_user macro. The current fixup returns
execution flow to the second sw instead of the one after it.
This patch fixes this fixup code according to the load access part.
Signed-off-by: Alan Kao<alankao@andestech.com>
Cc: Greentime Hu <greentime@andestech.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
|
|
In cpu_to_drc_index() in the case when FW_FEATURE_DRC_INFO is absent,
we currently use of_read_property() to obtain the pointer to the array
corresponding to the property "ibm,drc-indexes". The elements of this
array are of type __be32, but are accessed without any conversion to
the OS-endianness, which is buggy on a Little Endian OS.
Fix this by using of_property_read_u32_index() accessor function to
safely read the elements of the array.
Fixes: e83636ac3334 ("pseries/drc-info: Search DRC properties for CPU indexes")
Cc: stable@vger.kernel.org # v4.16+
Reported-by: Pavithra R. Prakash <pavrampu@in.ibm.com>
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Reviewed-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
[mpe: Make the WARN_ON a WARN_ON_ONCE so it's not retriggerable]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
We must not use legacy clock defines for dts clckctrl clocks as the offsets
will be wrong.
Fixes: 87fc89ced3a7 ("ARM: dts: am335x: Move l4 child devices to probe them with ti-sysc")
Cc: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
|
|
Enabling CQE support on Tegra186 Jetson TX2 has introduced a regression
that is causing accesses to the file-system on the eMMC to fail. Errors
such as the following have been observed ...
mmc2: running CQE recovery
mmc2: mmc_select_hs400 failed, error -110
print_req_error: I/O error, dev mmcblk2, sector 8 flags 80700
mmc2: cqhci: CQE failed to exit halt state
For now disable CQE support for Tegra186 until this issue is resolved.
Fixes: dfd3cb6feb73 arm64: tegra: Add CQE Support for SDMMC4
Signed-off-by: Jonathan Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into arm/fixes
i.MX fixes for 5.1:
- Correct phy mode setting of imx6dl-yapp4 board to fix a problem
caused by commit 5ecdd77c61c8 ("net: dsa: qca8k: disable delay
for RGMII mode").
- Add a missing of_node_put call to fix leaked reference detected by
coccinelle in imx51 machine code.
- Fix imx6q cpuidle driver bug which causes that CPU might not wake up
at expected time.
- Increase reset duration of Ethernet phy Micrel KSZ9031RNX to fix
transmission timeouts error seen on imx6qdl-phytec-pfla02 board.
- Correct SPDX License Identifier style for imx6ull-pinfunc-snvs.h.
- Fix 'bus-witdh' typos in imx6qdl-icore-rqs.dtsi.
- Correct pseudo PHY address of switch device for imx6dl-yapp4 board.
- Update PWM driver options in imx defconfig files due to the change
on driver part.
* tag 'imx-fixes-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
ARM: imx_v4_v5_defconfig: enable PWM driver
ARM: imx_v6_v7_defconfig: continue compiling the pwm driver
ARM: dts: imx6dl-yapp4: Use correct pseudo PHY address for the switch
ARM: dts: imx6qdl: Fix typo in imx6qdl-icore-rqs.dtsi
ARM: dts: imx6ull: Use the correct style for SPDX License Identifier
ARM: dts: pfla02: increase phy reset duration
ARM: imx6q: cpuidle: fix bug that CPU might not wake up at expected time
ARM: imx51: fix a leaked reference by adding missing of_node_put
ARM: dts: imx6dl-yapp4: Use rgmii-id phy mode on the cpu port
|