Age | Commit message (Collapse) | Author | Files | Lines |
|
git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/linux
Pull oprofile and dcookies removal from Viresh Kumar:
"Remove oprofile and dcookies support
The 'oprofile' user-space tools don't use the kernel OPROFILE support
any more, and haven't in a long time. User-space has been converted to
the perf interfaces.
The dcookies stuff is only used by the oprofile code. Now that
oprofile's support is getting removed from the kernel, there is no
need for dcookies as well.
Remove kernel's old oprofile and dcookies support"
* tag 'oprofile-removal-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/linux:
fs: Remove dcookies support
drivers: Remove CONFIG_OPROFILE support
arch: xtensa: Remove CONFIG_OPROFILE support
arch: x86: Remove CONFIG_OPROFILE support
arch: sparc: Remove CONFIG_OPROFILE support
arch: sh: Remove CONFIG_OPROFILE support
arch: s390: Remove CONFIG_OPROFILE support
arch: powerpc: Remove oprofile
arch: powerpc: Stop building and using oprofile
arch: parisc: Remove CONFIG_OPROFILE support
arch: mips: Remove CONFIG_OPROFILE support
arch: microblaze: Remove CONFIG_OPROFILE support
arch: ia64: Remove rest of perfmon support
arch: ia64: Remove CONFIG_OPROFILE support
arch: hexagon: Don't select HAVE_OPROFILE
arch: arc: Remove CONFIG_OPROFILE support
arch: arm: Remove CONFIG_OPROFILE support
arch: alpha: Remove CONFIG_OPROFILE support
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull ELF compat updates from Al Viro:
"Sanitizing ELF compat support, especially for triarch architectures:
- X32 handling cleaned up
- MIPS64 uses compat_binfmt_elf.c both for O32 and N32 now
- Kconfig side of things regularized
Eventually I hope to have compat_binfmt_elf.c killed, with both native
and compat built from fs/binfmt_elf.c, with -DELF_BITS={64,32} passed
by kbuild, but that's a separate story - not included here"
* 'work.elf-compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
get rid of COMPAT_ELF_EXEC_PAGESIZE
compat_binfmt_elf: don't bother with undef of ELF_ARCH
Kconfig: regularize selection of CONFIG_BINFMT_ELF
mips compat: switch to compat_binfmt_elf.c
mips: don't bother with ELF_CORE_EFLAGS
mips compat: don't bother with ELF_ET_DYN_BASE
mips: KVM_GUEST makes no sense for 64bit builds...
mips: kill unused definitions in binfmt_elf[on]32.c
mips binfmt_elf*32.c: use elfcore-compat.h
x32: make X32, !IA32_EMULATION setups able to execute x32 binaries
[amd64] clean PRSTATUS_SIZE/SET_PR_FPVALID up properly
elf_prstatus: collect the common part (everything before pr_reg) into a struct
binfmt_elf: partially sanitize PRSTATUS_SIZE and SET_PR_FPVALID
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 asm updates from Borislav Petkov:
"Annotate new MMIO-accessing insn wrappers' arguments with __iomem"
* tag 'x86_asm_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/asm: Add a missing __iomem annotation in enqcmds()
x86/asm: Annotate movdir64b()'s dst argument with __iomem
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 build updates from Borislav Petkov:
- Treat R_386_PLT32 relocations like R_386_PC32 ones when building
- Add documentation about "make kvm_guest/xen.config" in "make help"
output
* tag 'x86_build_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/build: Treat R_386_PLT32 relocation as R_386_PC32
x86/build: Realign archhelp
x86/build: Add {kvm_guest,xen}.config targets to make help's output
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 resource control updates from Borislav Petkov:
"Avoid IPI-ing a task in certain cases and prevent load/store tearing
when accessing a task's resctrl fields concurrently"
* tag 'x86_cache_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/resctrl: Apply READ_ONCE/WRITE_ONCE to task_struct.{rmid,closid}
x86/resctrl: Use task_curr() instead of task_struct->on_cpu to prevent unnecessary IPI
x86/resctrl: Add printf attribute to log function
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 CPUID cleanup from Borislav Petkov:
"Assign a dedicated feature word to a CPUID leaf which is widely used"
* tag 'x86_cpu_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/cpufeatures: Assign dedicated feature word for CPUID_0x8000001F[EAX]
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 FPU updates from Borislav Petkov:
"x86 fpu usage optimization and cleanups:
- make 64-bit kernel code which uses 387 insns request a x87 init
(FNINIT) explicitly when using the FPU
- misc cleanups"
* tag 'x86_fpu_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/fpu/xstate: Use sizeof() instead of a constant
x86/fpu/64: Don't FNINIT in kernel_fpu_begin()
x86/fpu: Make the EFI FPU calling convention explicit
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 microcode cleanup from Borislav Petkov:
"Make the driver init function static again"
* tag 'x86_microcode_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/microcode: Make microcode_init() static
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 misc updates from Borislav Petkov:
- Complete the MSR write filtering by applying it to the MSR ioctl
interface too.
- Other misc small fixups.
* tag 'x86_misc_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/MSR: Filter MSR writes through X86_IOC_WRMSR_REGS ioctl too
selftests/fpu: Fix debugfs_simple_attr.cocci warning
selftests/x86: Use __builtin_ia32_read/writeeflags
x86/reboot: Add Zotac ZBOX CI327 nano PCI reboot quirk
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 mm cleanups from Borislav Petkov:
- PTRACE_GETREGS/PTRACE_PUTREGS regset selection cleanup
- Another initial cleanup - more to follow - to the fault handling
code.
- Other minor cleanups and corrections.
* tag 'x86_mm_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
x86/{fault,efi}: Fix and rename efi_recover_from_page_fault()
x86/fault: Don't run fixups for SMAP violations
x86/fault: Don't look for extable entries for SMEP violations
x86/fault: Rename no_context() to kernelmode_fixup_or_oops()
x86/fault: Bypass no_context() for implicit kernel faults from usermode
x86/fault: Split the OOPS code out from no_context()
x86/fault: Improve kernel-executing-user-memory handling
x86/fault: Correct a few user vs kernel checks wrt WRUSS
x86/fault: Document the locking in the fault_signal_pending() path
x86/fault/32: Move is_f00f_bug() to do_kern_addr_fault()
x86/fault: Fold mm_fault_error() into do_user_addr_fault()
x86/fault: Skip the AMD erratum #91 workaround on unaffected CPUs
x86/fault: Fix AMD erratum #91 errata fixup for user code
x86/Kconfig: Remove HPET_EMULATE_RTC depends on RTC
x86/asm: Fixup TASK_SIZE_MAX comment
x86/ptrace: Clean up PTRACE_GETREGS/PTRACE_PUTREGS regset selection
x86/vm86/32: Remove VM86_SCREEN_BITMAP support
x86: Remove definition of DEBUG
x86/entry: Remove now unused do_IRQ() declaration
x86/mm: Remove duplicate definition of _PAGE_PAT_LARGE
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 paravirt updates from Borislav Petkov:
"Part one of a major conversion of the paravirt infrastructure to our
kernel patching facilities and getting rid of the custom-grown ones"
* tag 'x86_paravirt_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/pv: Rework arch_local_irq_restore() to not use popf
x86/xen: Drop USERGS_SYSRET64 paravirt call
x86/pv: Switch SWAPGS to ALTERNATIVE
x86/xen: Use specific Xen pv interrupt entry for DF
x86/xen: Use specific Xen pv interrupt entry for MCE
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 platform updates from Borislav Petkov:
- Convert geode drivers to look up the LED controls from a GPIO machine
descriptor table.
- Remove arch/x86/platform/goldfish as it is not used by the android
emulator anymore.
* tag 'x86_platform_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/platform/geode: Convert alix LED to GPIO machine descriptor
x86/platform/geode: Convert geode LED to GPIO machine descriptor
x86/platform/geode: Convert net5501 LED to GPIO machine descriptor
x86/platform: Retire arch/x86/platform/goldfish
x86/platform/intel-mid: Convert comma to semicolon
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 SEV-ES fix from Borislav Petkov:
"Do not unroll string I/O for SEV-ES guests because they support it"
* tag 'x86_seves_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/sev-es: Do not unroll string I/O for SEV-ES guests
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 SGX fixes from Borislav Petkov:
"Random small fixes which missed the initial SGX submission. Also, some
procedural clarifications"
* tag 'x86_sgx_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
MAINTAINERS: Add Dave Hansen as reviewer for INTEL SGX
x86/sgx: Drop racy follow_pfn() check
MAINTAINERS: Fix the tree location for INTEL SGX patches
x86/sgx: Fix the return type of sgx_init()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI updates from Ard Biesheuvel via Borislav Petkov:
"A few cleanups left and right, some of which were part of a initrd
measured boot series that needs some more work, and so only the
cleanup patches have been included for this release"
* tag 'efi-next-for-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
efi/arm64: Update debug prints to reflect other entropy sources
efi: x86: clean up previous struct mm switching
efi: x86: move mixed mode stack PA variable out of 'efi_scratch'
efi/libstub: move TPM related prototypes into efistub.h
efi/libstub: fix prototype of efi_tcg2_protocol::get_event_log()
efi/libstub: whitespace cleanup
efi: ia64: move IA64-only declarations to new asm/efi.h header
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RAS updates from Borislav Petkov:
- move therm_throt.c to the thermal framework, where it belongs.
- identify CPUs which miss to enter the broadcast handler, as an
additional debugging aid.
* tag 'ras_updates_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
thermal: Move therm_throt there from x86/mce
x86/mce: Get rid of mcheck_intel_therm_init()
x86/mce: Make mce_timed_out() identify holdout CPUs
|
|
Pull networking updates from David Miller:
"Here is what we have this merge window:
1) Support SW steering for mlx5 Connect-X6Dx, from Yevgeny Kliteynik.
2) Add RSS multi group support to octeontx2-pf driver, from Geetha
Sowjanya.
3) Add support for KS8851 PHY. From Marek Vasut.
4) Add support for GarfieldPeak bluetooth controller from Kiran K.
5) Add support for half-duplex tcan4x5x can controllers.
6) Add batch skb rx processing to bcrm63xx_enet, from Sieng Piaw
Liew.
7) Rework RX port offload infrastructure, particularly wrt, UDP
tunneling, from Jakub Kicinski.
8) Add BCM72116 PHY support, from Florian Fainelli.
9) Remove Dsa specific notifiers, they are unnecessary. From Vladimir
Oltean.
10) Add support for picosecond rx delay in dwmac-meson8b chips. From
Martin Blumenstingl.
11) Support TSO on xfrm interfaces from Eyal Birger.
12) Add support for MP_PRIO to mptcp stack, from Geliang Tang.
13) Support BCM4908 integrated switch, from Rafał Miłecki.
14) Support for directly accessing kernel module variables via module
BTF info, from Andrii Naryiko.
15) Add DASH (esktop and mobile Architecture for System Hardware)
support to r8169 driver, from Heiner Kallweit.
16) Add rx vlan filtering to dpaa2-eth, from Ionut-robert Aron.
17) Add support for 100 base0x SFP devices, from Bjarni Jonasson.
18) Support link aggregation in DSA, from Tobias Waldekranz.
19) Support for bitwidse atomics in bpf, from Brendan Jackman.
20) SmartEEE support in at803x driver, from Russell King.
21) Add support for flow based tunneling to GTP, from Pravin B Shelar.
22) Allow arbitrary number of interconnrcts in ipa, from Alex Elder.
23) TLS RX offload for bonding, from Tariq Toukan.
24) RX decap offklload support in mac80211, from Felix Fietkou.
25) devlink health saupport in octeontx2-af, from George Cherian.
26) Add TTL attr to SCM_TIMESTAMP_OPT_STATS, from Yousuk Seung
27) Delegated actionss support in mptcp, from Paolo Abeni.
28) Support receive timestamping when doin zerocopy tcp receive. From
Arjun Ray.
29) HTB offload support for mlx5, from Maxim Mikityanskiy.
30) UDP GRO forwarding, from Maxim Mikityanskiy.
31) TAPRIO offloading in dsa hellcreek driver, from Kurt Kanzenbach.
32) Weighted random twos choice algorithm for ipvs, from Darby Payne.
33) Fix netdev registration deadlock, from Johannes Berg.
34) Various conversions to new tasklet api, from EmilRenner Berthing.
35) Bulk skb allocations in veth, from Lorenzo Bianconi.
36) New ethtool interface for lane setting, from Danielle Ratson.
37) Offload failiure notifications for routes, from Amit Cohen.
38) BCM4908 support, from Rafał Miłecki.
39) Support several new iwlwifi chips, from Ihab Zhaika.
40) Flow drector support for ipv6 in i40e, from Przemyslaw Patynowski.
41) Support for mhi prrotocols, from Loic Poulain.
42) Optimize bpf program stats.
43) Implement RFC6056, for better port randomization, from Eric
Dumazet.
44) hsr tag offloading support from George McCollister.
45) Netpoll support in qede, from Bhaskar Upadhaya.
46) 2005/400g speed support in bonding 3ad mode, from Nikolay
Aleksandrov.
47) Netlink event support in mptcp, from Florian Westphal.
48) Better skbuff caching, from Alexander Lobakin.
49) MRP (Media Redundancy Protocol) offloading in DSA and a few
drivers, from Horatiu Vultur.
50) mqprio saupport in mvneta, from Maxime Chevallier.
51) Remove of_phy_attach, no longer needed, from Florian Fainelli"
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1766 commits)
octeontx2-pf: Fix otx2_get_fecparam()
cteontx2-pf: cn10k: Prevent harmless double shift bugs
net: stmmac: Add PCI bus info to ethtool driver query output
ptp: ptp_clockmatrix: clean-up - parenthesis around a == b are unnecessary
ptp: ptp_clockmatrix: Simplify code - remove unnecessary `err` variable.
ptp: ptp_clockmatrix: Coding style - tighten vertical spacing.
ptp: ptp_clockmatrix: Clean-up dev_*() messages.
ptp: ptp_clockmatrix: Remove unused header declarations.
ptp: ptp_clockmatrix: Add alignment of 1 PPS to idtcm_perout_enable.
ptp: ptp_clockmatrix: Add wait_for_sys_apll_dpll_lock.
net: stmmac: dwmac-sun8i: Add a shutdown callback
net: stmmac: dwmac-sun8i: Minor probe function cleanup
net: stmmac: dwmac-sun8i: Use reset_control_reset
net: stmmac: dwmac-sun8i: Remove unnecessary PHY power check
net: stmmac: dwmac-sun8i: Return void from PHY unpower
r8169: use macro pm_ptr
net: mdio: Remove of_phy_attach()
net: mscc: ocelot: select PACKING in the Kconfig
net: re-solve some conflicts after net -> net-next merge
net: dsa: tag_rtl4_a: Support also egress tags
...
|
|
Remove several exports from the MMU that are no longer necessary.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210213005015.1651772-15-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Drop kvm_mmu_slot_largepage_remove_write_access() and refactor its sole
caller to use kvm_mmu_slot_remove_write_access(). Remove the now-unused
slot_handle_large_level() and slot_handle_all_level() helpers.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210213005015.1651772-14-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Stop setting dirty bits for MMU pages when dirty logging is disabled for
a memslot, as PML is now completely disabled when there are no memslots
with dirty logging enabled.
This means that spurious PML entries will be created for memslots with
dirty logging disabled if at least one other memslot has dirty logging
enabled. However, spurious PML entries are already possible since
dirty bits are set only when a dirty logging is turned off, i.e. memslots
that are never dirty logged will have dirty bits cleared.
In the end, it's faster overall to eat a few spurious PML entries in the
window where dirty logging is being disabled across all memslots.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210213005015.1651772-13-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Currently, if enable_pml=1 PML remains enabled for the entire lifetime
of the VM irrespective of whether dirty logging is enable or disabled.
When dirty logging is disabled, all the pages of the VM are manually
marked dirty, so that PML is effectively non-operational. Setting
the dirty bits is an expensive operation which can cause severe MMU
lock contention in a performance sensitive path when dirty logging is
disabled after a failed or canceled live migration.
Manually setting dirty bits also fails to prevent PML activity if some
code path clears dirty bits, which can incur unnecessary VM-Exits.
In order to avoid this extra overhead, dynamically enable/disable PML
when dirty logging gets turned on/off for the first/last memslot.
Signed-off-by: Makarand Sonare <makarandsonare@google.com>
Co-developed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210213005015.1651772-12-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Add a sanity check in kvm_mmu_slot_apply_flags to assert that the
LOG_DIRTY_PAGES flag is indeed being toggled, and explicitly rely on
that holding true when zapping collapsible SPTEs. Manipulating the
CPU dirty log (PML) and write-protection also relies on this assertion,
but that's not obvious in the current code.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210213005015.1651772-11-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Drop the facade of KVM's PML logic being vendor specific and move the
bits that aren't truly VMX specific into common x86 code. The MMU logic
for dealing with PML is tightly coupled to the feature and to VMX's
implementation, bouncing through kvm_x86_ops obfuscates the code without
providing any meaningful separation of concerns or encapsulation.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210213005015.1651772-10-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Store the vendor-specific dirty log size in a variable, there's no need
to wrap it in a function since the value is constant after
hardware_setup() runs.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210213005015.1651772-9-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Expand the comment about need to use write-protection for nested EPT
when PML is enabled to clarify that the tagging is a nop when PML is
_not_ enabled. Without the clarification, omitting the PML check looks
wrong at first^Wfifth glance.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210213005015.1651772-8-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Unconditionally disable PML in vmcs02, KVM emulates PML purely in the
MMU, e.g. vmx_flush_pml_buffer() doesn't even try to copy the L2 GPAs
from vmcs02's buffer to vmcs12. At best, enabling PML is a nop. At
worst, it will cause vmx_flush_pml_buffer() to record bogus GFNs in the
dirty logs.
Initialize vmcs02.GUEST_PML_INDEX such that PML writes would trigger
VM-Exit if PML was somehow enabled, skip flushing the buffer for guest
mode since the index is bogus, and freak out if a PML full exit occurs
when L2 is active.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210213005015.1651772-7-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
When zapping SPTEs in order to rebuild them as huge pages, use the new
helper that computes the max mapping level to detect whether or not a
SPTE should be zapped. Doing so avoids zapping SPTEs that can't
possibly be rebuilt as huge pages, e.g. due to hardware constraints,
memslot alignment, etc...
This also avoids zapping SPTEs that are still large, e.g. if migration
was canceled before write-protected huge pages were shattered to enable
dirty logging. Note, such pages are still write-protected at this time,
i.e. a page fault VM-Exit will still occur. This will hopefully be
addressed in a future patch.
Sadly, TDP MMU loses its const on the memslot, but that's a pervasive
problem that's been around for quite some time.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210213005015.1651772-6-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Pass the memslot to the rmap callbacks, it will be used when zapping
collapsible SPTEs to verify the memslot is compatible with hugepages
before zapping its SPTEs.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210213005015.1651772-5-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Factor out the logic for determining the maximum mapping level given a
memslot and a gpa. The helper will be used when zapping collapsible
SPTEs when disabling dirty logging, e.g. to avoid zapping SPTEs that
can't possibly be rebuilt as hugepages.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210213005015.1651772-4-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
HugeTLB pages
Zap SPTEs that are backed by ZONE_DEVICE pages when zappings SPTEs to
rebuild them as huge pages in the TDP MMU. ZONE_DEVICE huge pages are
managed differently than "regular" pages and are not compound pages.
Likewise, PageTransCompoundMap() will not detect HugeTLB, so switch
to PageCompound().
This matches the similar check in kvm_mmu_zap_collapsible_spte.
Cc: Ben Gardon <bgardon@google.com>
Fixes: 14881998566d ("kvm: x86/mmu: Support disabling dirty logging for the tdp MMU")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210213005015.1651772-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
This is not needed because the tweak was done on the guest_mmu, while
nested_ept_uninit_mmu_context has just changed vcpu->arch.walk_mmu
back to the root_mmu.
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
In case of npt=0 on host, nSVM needs the same .inject_page_fault tweak
as VMX has, to make sure that shadow mmu faults are injected as vmexits.
It is not clear why this is needed at all, but for now keep the same
code as VMX and we'll fix it for both.
Based on a patch by Maxim Levitsky <mlevitsk@redhat.com>.
Fixes: 7c86663b68ba ("KVM: nSVM: inject exceptions via svm_check_nested_events")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
This way trace will capture all the nested mode entries
(including entries after migration, and from smm)
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20210217145718.1217358-3-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
trace_kvm_exit prints this value (using vmx_get_exit_info)
so it makes sense to read it before the trace point.
Fixes: dcf068da7eb2 ("KVM: VMX: Introduce generic fastpath handler")
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20210217145718.1217358-2-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Remove the restriction that prevents VMX from exposing INVPCID to the
guest without PCID also being exposed to the guest. The justification of
the restriction is that INVPCID will #UD if it's disabled in the VMCS.
While that is a true statement, it's also true that RDTSCP will #UD if
it's disabled in the VMCS. Neither of those things has any dependency
whatsoever on the guest being able to set CR4.PCIDE=1, which is what is
effectively allowed by exposing PCID to the guest.
Removing the bogus restriction aligns VMX with SVM, and also allows for
an interesting configuration. INVPCID is that fastest way to do a global
TLB flush, e.g. see native_flush_tlb_global(). Allowing INVPCID without
PCID would let a guest use the expedited flush while also limiting the
number of ASIDs consumed by the guest.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210212003411.1102677-4-seanjc@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Advertise INVPCID by default (if supported by the host kernel) instead
of having both SVM and VMX opt in. INVPCID was opt in when it was a
VMX only feature so that KVM wouldn't prematurely advertise support
if/when it showed up in the kernel on AMD hardware.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210212003411.1102677-3-seanjc@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Intercept INVPCID if it's disabled in the guest, even when using NPT,
as KVM needs to inject #UD in this case.
Fixes: 4407a797e941 ("KVM: SVM: Enable INVPCID feature on AMD")
Cc: Babu Moger <babu.moger@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210212003411.1102677-2-seanjc@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Allow building x86 with PREEMPT_DYNAMIC=n, this is needed for
PREEMPT_RT as it makes no sense to not have full preemption on
PREEMPT_RT.
Fixes: 8c98e8cf723c ("preempt/dynamic: Provide preempt_schedule[_notrace]() static calls")
Reported-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Mike Galbraith <efault@gmx.de>
Link: https://lkml.kernel.org/r/YCK1+JyFNxQnWeXK@hirez.programming.kicks-ass.net
|
|
Following the idle loop model, cleanly check for pending rcuog wakeup
before the last rescheduling point upon resuming to guest mode. This
way we can avoid to do it from rcu_user_enter() with the last resort
self-IPI hack that enforces rescheduling.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20210131230548.32970-6-frederic@kernel.org
|
|
Use the new EXPORT_STATIC_CALL_TRAMP() / static_call_mod() to unexport
the static_call_key for the PREEMPT_DYNAMIC calls such that modules
can no longer update these calls.
Having modules change/hi-jack the preemption calls would be horrible.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
When exporting static_call_key; with EXPORT_STATIC_CALL*(), the module
can use static_call_update() to change the function called. This is
not desirable in general.
Not exporting static_call_key however also disallows usage of
static_call(), since objtool needs the key to construct the
static_call_site.
Solve this by allowing objtool to create the static_call_site using
the trampoline address when it builds a module and cannot find the
static_call_key symbol. The module loader will then try and map the
trampole back to a key before it constructs the normal sites list.
Doing this requires a trampoline -> key associsation, so add another
magic section that keeps those.
Originally-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210127231837.ifddpn7rhwdaepiu@treble
|
|
Provide static calls to control preempt_schedule[_notrace]()
(called in CONFIG_PREEMPT) so that we can override their behaviour when
preempt= is overriden.
Since the default behaviour is full preemption, both their calls are
initialized to the arch provided wrapper, if any.
[fweisbec: only define static calls when PREEMPT_DYNAMIC, make it less
dependent on x86 with __preempt_schedule_func]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210118141223.123667-7-frederic@kernel.org
|
|
Preemption mode selection is currently hardcoded on Kconfig choices.
Introduce a dedicated option to tune preemption flavour at boot time,
This will be only available on architectures efficiently supporting
static calls in order not to tempt with the feature against additional
overhead that might be prohibitive or undesirable.
CONFIG_PREEMPT_DYNAMIC is automatically selected by CONFIG_PREEMPT if
the architecture provides the necessary support (CONFIG_STATIC_CALL_INLINE,
CONFIG_GENERIC_ENTRY, and provide with __preempt_schedule_function() /
__preempt_schedule_notrace_function()).
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
[peterz: relax requirement to HAVE_STATIC_CALL]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210118141223.123667-5-frederic@kernel.org
|
|
Provide a stub function that return 0 and wire up the static call site
patching to replace the CALL with a single 5 byte instruction that
clears %RAX, the return value register.
The function can be cast to any function pointer type that has a
single %RAX return (including pointers). Also provide a version that
returns an int for convenience. We are clearing the entire %RAX register
in any case, whether the return value is 32 or 64 bits, since %RAX is
always a scratch register anyway.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210118141223.123667-2-frederic@kernel.org
|
|
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Daniel Borkmann says:
====================
pull-request: bpf-next 2021-02-16
The following pull-request contains BPF updates for your *net-next* tree.
There's a small merge conflict between 7eeba1706eba ("tcp: Add receive timestamp
support for receive zerocopy.") from net-next tree and 9cacf81f8161 ("bpf: Remove
extra lock_sock for TCP_ZEROCOPY_RECEIVE") from bpf-next tree. Resolve as follows:
[...]
lock_sock(sk);
err = tcp_zerocopy_receive(sk, &zc, &tss);
err = BPF_CGROUP_RUN_PROG_GETSOCKOPT_KERN(sk, level, optname,
&zc, &len, err);
release_sock(sk);
[...]
We've added 116 non-merge commits during the last 27 day(s) which contain
a total of 156 files changed, 5662 insertions(+), 1489 deletions(-).
The main changes are:
1) Adds support of pointers to types with known size among global function
args to overcome the limit on max # of allowed args, from Dmitrii Banshchikov.
2) Add bpf_iter for task_vma which can be used to generate information similar
to /proc/pid/maps, from Song Liu.
3) Enable bpf_{g,s}etsockopt() from all sock_addr related program hooks. Allow
rewriting bind user ports from BPF side below the ip_unprivileged_port_start
range, both from Stanislav Fomichev.
4) Prevent recursion on fentry/fexit & sleepable programs and allow map-in-map
as well as per-cpu maps for the latter, from Alexei Starovoitov.
5) Add selftest script to run BPF CI locally. Also enable BPF ringbuffer
for sleepable programs, both from KP Singh.
6) Extend verifier to enable variable offset read/write access to the BPF
program stack, from Andrei Matei.
7) Improve tc & XDP MTU handling and add a new bpf_check_mtu() helper to
query device MTU from programs, from Jesper Dangaard Brouer.
8) Allow bpf_get_socket_cookie() helper also be called from [sleepable] BPF
tracing programs, from Florent Revest.
9) Extend x86 JIT to pad JMPs with NOPs for helping image to converge when
otherwise too many passes are required, from Gary Lin.
10) Verifier fixes on atomics with BPF_FETCH as well as function-by-function
verification both related to zero-extension handling, from Ilya Leoshkevich.
11) Better kernel build integration of resolve_btfids tool, from Jiri Olsa.
12) Batch of AF_XDP selftest cleanups and small performance improvement
for libbpf's xsk map redirect for newer kernels, from Björn Töpel.
13) Follow-up BPF doc and verifier improvements around atomics with
BPF_FETCH, from Brendan Jackman.
14) Permit zero-sized data sections e.g. if ELF .rodata section contains
read-only data from local variables, from Yonghong Song.
15) veth driver skb bulk-allocation for ndo_xdp_xmit, from Lorenzo Bianconi.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Update Copyright year and drop file names from files themselves.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
After the commit f1be6cdaf57c ("x86/platform/intel-mid: Make
intel_scu_device_register() static") the platform_device.h is not being
used anymore by intel-mid.h. Remove it.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Since there is no more user of this global variable and associated custom API,
we may safely drop this legacy reinvented a wheel from the kernel sources.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The header is used by a single user. Move header content to that user.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|