summaryrefslogtreecommitdiffstats
path: root/arch
AgeCommit message (Collapse)AuthorFilesLines
2022-05-23Merge tag 'x86_build_for_v5.19_rc1' of ↵Linus Torvalds3-13/+26
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 build updates from Borislav Petkov: - Add a "make x86_debug.config" target which enables a bunch of useful config debug options when trying to debug an issue - A gcc-12 build warnings fix * tag 'x86_build_for_v5.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/boot: Wrap literal addresses in absolute_pointer() x86/configs: Add x86 debugging Kconfig fragment plus docs
2022-05-23Merge tag 'x86_asm_for_v5.19_rc1' of ↵Linus Torvalds7-166/+74
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 asm updates from Borislav Petkov: - A bunch of changes towards streamlining low level asm helpers' calling conventions so that former can be converted to C eventually - Simplify PUSH_AND_CLEAR_REGS so that it can be used at the system call entry paths instead of having opencoded, slightly different variants of it everywhere - Misc other fixes * tag 'x86_asm_for_v5.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/entry: Fix register corruption in compat syscall objtool: Fix STACK_FRAME_NON_STANDARD reloc type linkage: Fix issue with missing symbol size x86/entry: Remove skip_r11rcx x86/entry: Use PUSH_AND_CLEAR_REGS for compat x86/entry: Simplify entry_INT80_compat() x86/mm: Simplify RESERVE_BRK() x86/entry: Convert SWAPGS to swapgs and remove the definition of SWAPGS x86/entry: Don't call error_entry() for XENPV x86/entry: Move CLD to the start of the idtentry macro x86/entry: Move PUSH_AND_CLEAR_REGS out of error_entry() x86/entry: Switch the stack after error_entry() returns x86/traps: Use pt_regs directly in fixup_bad_iret()
2022-05-23Merge tag 'x86_cpu_for_v5.19_rc1' of ↵Linus Torvalds13-166/+101
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 CPU feature updates from Borislav Petkov: - Remove a bunch of chicken bit options to turn off CPU features which are not really needed anymore - Misc fixes and cleanups * tag 'x86_cpu_for_v5.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/speculation: Add missing prototype for unpriv_ebpf_notify() x86/pm: Fix false positive kmemleak report in msr_build_context() x86/speculation/srbds: Do not try to turn mitigation off when not supported x86/cpu: Remove "noclflush" x86/cpu: Remove "noexec" x86/cpu: Remove "nosmep" x86/cpu: Remove CONFIG_X86_SMAP and "nosmap" x86/cpu: Remove "nosep" x86/cpu: Allow feature bit names from /proc/cpuinfo in clearcpuid=
2022-05-23Merge tag 'x86_tdx_for_v5.19_rc1' of ↵Linus Torvalds49-120/+1835
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull Intel TDX support from Borislav Petkov: "Intel Trust Domain Extensions (TDX) support. This is the Intel version of a confidential computing solution called Trust Domain Extensions (TDX). This series adds support to run the kernel as part of a TDX guest. It provides similar guest protections to AMD's SEV-SNP like guest memory and register state encryption, memory integrity protection and a lot more. Design-wise, it differs from AMD's solution considerably: it uses a software module which runs in a special CPU mode called (Secure Arbitration Mode) SEAM. As the name suggests, this module serves as sort of an arbiter which the confidential guest calls for services it needs during its lifetime. Just like AMD's SNP set, this series reworks and streamlines certain parts of x86 arch code so that this feature can be properly accomodated" * tag 'x86_tdx_for_v5.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (34 commits) x86/tdx: Fix RETs in TDX asm x86/tdx: Annotate a noreturn function x86/mm: Fix spacing within memory encryption features message x86/kaslr: Fix build warning in KASLR code in boot stub Documentation/x86: Document TDX kernel architecture ACPICA: Avoid cache flush inside virtual machines x86/tdx/ioapic: Add shared bit for IOAPIC base address x86/mm: Make DMA memory shared for TD guest x86/mm/cpa: Add support for TDX shared memory x86/tdx: Make pages shared in ioremap() x86/topology: Disable CPU online/offline control for TDX guests x86/boot: Avoid #VE during boot for TDX platforms x86/boot: Set CR0.NE early and keep it set during the boot x86/acpi/x86/boot: Add multiprocessor wake-up support x86/boot: Add a trampoline for booting APs via firmware handoff x86/tdx: Wire up KVM hypercalls x86/tdx: Port I/O: Add early boot support x86/tdx: Port I/O: Add runtime hypercalls x86/boot: Port I/O: Add decompression-time support for TDX x86/boot: Port I/O: Allow to hook up alternative helpers ...
2022-05-23Merge tag 'ras_core_for_v5.19_rc1' of ↵Linus Torvalds3-79/+67
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 RAS updates from Borislav Petkov: - Simplification of the AMD MCE error severity grading logic along with supplying critical panic MCEs with accompanying error messages for more human-friendly diagnostics. - Misc fixes * tag 'ras_core_for_v5.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mce: Add messages for panic errors in AMD's MCE grading x86/mce: Simplify AMD severity grading logic x86/MCE/AMD: Fix memory leak when threshold_create_bank() fails x86/mce: Avoid unnecessary padding in struct mce_bank
2022-05-23Merge tag 'x86_sev_for_v5.19_rc1' of ↵Linus Torvalds48-392/+2772
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull AMD SEV-SNP support from Borislav Petkov: "The third AMD confidential computing feature called Secure Nested Paging. Add to confidential guests the necessary memory integrity protection against malicious hypervisor-based attacks like data replay, memory remapping and others, thus achieving a stronger isolation from the hypervisor. At the core of the functionality is a new structure called a reverse map table (RMP) with which the guest has a say in which pages get assigned to it and gets notified when a page which it owns, gets accessed/modified under the covers so that the guest can take an appropriate action. In addition, add support for the whole machinery needed to launch a SNP guest, details of which is properly explained in each patch. And last but not least, the series refactors and improves parts of the previous SEV support so that the new code is accomodated properly and not just bolted on" * tag 'x86_sev_for_v5.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (60 commits) x86/entry: Fixup objtool/ibt validation x86/sev: Mark the code returning to user space as syscall gap x86/sev: Annotate stack change in the #VC handler x86/sev: Remove duplicated assignment to variable info x86/sev: Fix address space sparse warning x86/sev: Get the AP jump table address from secrets page x86/sev: Add missing __init annotations to SEV init routines virt: sevguest: Rename the sevguest dir and files to sev-guest virt: sevguest: Change driver name to reflect generic SEV support x86/boot: Put globals that are accessed early into the .data section x86/boot: Add an efi.h header for the decompressor virt: sevguest: Fix bool function returning negative value virt: sevguest: Fix return value check in alloc_shared_pages() x86/sev-es: Replace open-coded hlt-loop with sev_es_terminate() virt: sevguest: Add documentation for SEV-SNP CPUID Enforcement virt: sevguest: Add support to get extended report virt: sevguest: Add support to derive key virt: Add SEV-SNP guest driver x86/sev: Register SEV-SNP guest request platform device x86/sev: Provide support for SNP guest request NAEs ...
2022-05-23Merge tag 'x86-irq-2022-05-23' of ↵Linus Torvalds2-64/+322
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 PCI irq routing updates from Thomas Gleixner: - Cleanup and robustify the PCI interrupt routing table handling including proper range checks - Add support for Intel 82378ZB/82379AB, SiS85C497 PIRQ routers - Fix the ALi M1487 router handling - Handle the IRT routing table format in AMI BIOSes correctly * tag 'x86-irq-2022-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/PCI: Fix coding style in PIRQ table verification x86/PCI: Fix ALi M1487 (IBC) PIRQ router link value interpretation x86/PCI: Add $IRT PIRQ routing table support x86/PCI: Handle PIRQ routing tables with no router device given x86/PCI: Add PIRQ routing table range checks x86/PCI: Add support for the SiS85C497 PIRQ router x86/PCI: Disambiguate SiS85C503 PIRQ router code entities x86/PCI: Handle IRQ swizzling with PIRQ routers x86/PCI: Also match function number in $PIR table x86/PCI: Include function number in $PIR table dump x86/PCI: Show the physical address of the $PIR table
2022-05-23Merge tag 'irq-core-2022-05-23' of ↵Linus Torvalds4-18/+13
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull interrupt handling updates from Thomas Gleixner: "Core code: - Make the managed interrupts more robust by shutting them down in the core code when the assigned affinity mask does not contain online CPUs. - Make the irq simulator chip work on RT - A small set of cpumask and power manageent cleanups Drivers: - A set of changes which mark GPIO interrupt chips immutable to prevent the GPIO subsystem from modifying it under the hood. This provides the necessary infrastructure and converts a set of GPIO and pinctrl drivers over. - A set of changes to make the pseudo-NMI handling for GICv3 more robust: a missing barrier and consistent handling of the priority mask. - Another set of GICv3 improvements and fixes, but nothing outstanding - The usual set of improvements and cleanups all over the place - No new irqchip drivers and not even a new device tree binding! 100+ interrupt chips are truly enough" * tag 'irq-core-2022-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (39 commits) irqchip: Add Kconfig symbols for sunxi drivers irqchip/gic-v3: Fix priority mask handling irqchip/gic-v3: Refactor ISB + EOIR at ack time irqchip/gic-v3: Ensure pseudo-NMIs have an ISB between ack and handling genirq/irq_sim: Make the irq_work always run in hard irq context irqchip/armada-370-xp: Do not touch Performance Counter Overflow on A375, A38x, A39x irqchip/gic: Improved warning about incorrect type irqchip/csky: Return true/false (not 1/0) from bool functions irqchip/imx-irqsteer: Add runtime PM support irqchip/imx-irqsteer: Constify irq_chip struct irqchip/armada-370-xp: Enable MSI affinity configuration irqchip/aspeed-scu-ic: Fix irq_of_parse_and_map() return value irqchip/aspeed-i2c-ic: Fix irq_of_parse_and_map() return value irqchip/sun6i-r: Use NULL for chip_data irqchip/xtensa-mx: Fix initial IRQ affinity in non-SMP setup irqchip/exiu: Fix acknowledgment of edge triggered interrupts irqchip/gic-v3: Claim iomem resources dt-bindings: interrupt-controller: arm,gic-v3: Make the v2 compat requirements explicit irqchip/gic-v3: Relax polling of GIC{R,D}_CTLR.RWP irqchip/gic-v3: Detect LPI invalidation MMIO registers ...
2022-05-23Merge tag 'smp-core-2022-05-23' of ↵Linus Torvalds1-1/+4
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull CPU hotplug updates from Thomas Gleixner: - Initialize the per-CPU structures during early boot so that the state is consistent from the very beginning. - Make the virtualization hotplug state handling more robust and let the core bringup CPUs which timed out in an earlier attempt again. - Make the x86/xen CPU state tracking consistent on a failed online attempt, so a consecutive bringup does not fall over the inconsistent state. * tag 'smp-core-2022-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: cpu/hotplug: Initialise all cpuhp_cpu_state structs earlier cpu/hotplug: Allow the CPU in CPU_UP_PREPARE state to be brought up again. x86/xen: Allow to retry if cpu_initialize_context() failed.
2022-05-23Merge tag 'for-5.19/drivers-2022-05-22' of git://git.kernel.dk/linux-blockLinus Torvalds1-1/+0
Pull block driver updates from Jens Axboe: "Here are the driver updates queued up for 5.19. This contains: - NVMe pull requests via Christoph: - tighten the PCI presence check (Stefan Roese) - fix a potential NULL pointer dereference in an error path (Kyle Miller Smith) - fix interpretation of the DMRSL field (Tom Yan) - relax the data transfer alignment (Keith Busch) - verbose error logging improvements (Max Gurtovoy, Chaitanya Kulkarni) - misc cleanups (Chaitanya Kulkarni, Christoph) - set non-mdts limits in nvme_scan_work (Chaitanya Kulkarni) - add support for TP4084 - Time-to-Ready Enhancements (Christoph) - MD pull request via Song: - Improve annotation in raid5 code, by Logan Gunthorpe - Support MD_BROKEN flag in raid-1/5/10, by Mariusz Tkaczyk - Other small fixes/cleanups - null_blk series making the configfs side much saner (Damien) - Various minor drbd cleanups and fixes (Haowen, Uladzislau, Jiapeng, Arnd, Cai) - Avoid using the system workqueue (and hence flushing it) in rnbd (Jack) - Avoid using the system workqueue (and hence flushing it) in aoe (Tetsuo) - Series fixing discard_alignment issues in drivers (Christoph) - Small series fixing drivers poking at disk->part0 for openers information (Christoph) - Series fixing deadlocks in loop (Christoph, Tetsuo) - Remove loop.h and add SPDX headers (Christoph) - Various fixes and cleanups (Julia, Xie, Yu)" * tag 'for-5.19/drivers-2022-05-22' of git://git.kernel.dk/linux-block: (72 commits) mtip32xx: fix typo in comment nvme: set non-mdts limits in nvme_scan_work nvme: add support for TP4084 - Time-to-Ready Enhancements nvme: split the enum used for various register constants nbd: Fix hung on disconnect request if socket is closed before nvme-fabrics: add a request timeout helper nvme-pci: harden drive presence detect in nvme_dev_disable() nvme-pci: fix a NULL pointer dereference in nvme_alloc_admin_tags nvme: mark internal passthru request RQF_QUIET nvme: remove unneeded include from constants file nvme: add missing status values to verbose logging nvme: set dma alignment to dword nvme: fix interpretation of DMRSL loop: remove most the top-of-file boilerplate comment from the UAPI header loop: remove most the top-of-file boilerplate comment loop: add a SPDX header loop: remove loop.h block: null_blk: Improve device creation with configfs block: null_blk: Cleanup messages block: null_blk: Cleanup device creation and deletion ...
2022-05-23Merge tag 'for-5.19/block-2022-05-22' of git://git.kernel.dk/linux-blockLinus Torvalds1-2/+0
Pull block updates from Jens Axboe: "Here are the core block changes for 5.19. This contains: - blk-throttle accounting fix (Laibin) - Series removing redundant assignments (Michal) - Expose bio cache via the bio_set, so that DM can use it (Mike) - Finish off the bio allocation interface cleanups by dealing with the weirdest member of the family. bio_kmalloc combines a kmalloc for the bio and bio_vecs with a hidden bio_init call and magic cleanup semantics (Christoph) - Clean up the block layer API so that APIs consumed by file systems are (almost) only struct block_device based, so that file systems don't have to poke into block layer internals like the request_queue (Christoph) - Clean up the blk_execute_rq* API (Christoph) - Clean up various lose end in the blk-cgroup code to make it easier to follow in preparation of reworking the blkcg assignment for bios (Christoph) - Fix use-after-free issues in BFQ when processes with merged queues get moved to different cgroups (Jan) - BFQ fixes (Jan) - Various fixes and cleanups (Bart, Chengming, Fanjun, Julia, Ming, Wolfgang, me)" * tag 'for-5.19/block-2022-05-22' of git://git.kernel.dk/linux-block: (83 commits) blk-mq: fix typo in comment bfq: Remove bfq_requeue_request_body() bfq: Remove superfluous conversion from RQ_BIC() bfq: Allow current waker to defend against a tentative one bfq: Relax waker detection for shared queues blk-cgroup: delete rcu_read_lock_held() WARN_ON_ONCE() blk-throttle: Set BIO_THROTTLED when bio has been throttled blk-cgroup: Remove unnecessary rcu_read_lock/unlock() blk-cgroup: always terminate io.stat lines block, bfq: make bfq_has_work() more accurate block, bfq: protect 'bfqd->queued' by 'bfqd->lock' block: cleanup the VM accounting in submit_bio block: Fix the bio.bi_opf comment block: reorder the REQ_ flags blk-iocost: combine local_stat and desc_stat to stat block: improve the error message from bio_check_eod block: allow passing a NULL bdev to bio_alloc_clone/bio_init_clone block: remove superfluous calls to blkcg_bio_issue_init kthread: unexport kthread_blkcg blk-cgroup: cleanup blkcg_maybe_throttle_current ...
2022-05-23Merge tag 'rcu.2022.05.19a' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu Pull RCU update from Paul McKenney: - Documentation updates - Miscellaneous fixes - Callback-offloading updates, mainly simplifications - RCU-tasks updates, including some -rt fixups, handling of systems with sparse CPU numbering, and a fix for a boot-time race-condition failure - Put SRCU on a memory diet in order to reduce the size of the srcu_struct structure - Torture-test updates fixing some bugs in tests and closing some testing holes - Torture-test updates for the RCU tasks flavors, most notably ensuring that building rcutorture and friends does not change the RCU-tasks-related Kconfig options - Torture-test scripting updates - Expedited grace-period updates, most notably providing milliseconds-scale (not all that) soft real-time response from synchronize_rcu_expedited(). This is also the first time in almost 30 years of RCU that someone other than me has pushed for a reduction in the RCU CPU stall-warning timeout, in this case by more than three orders of magnitude from 21 seconds to 20 milliseconds. This tighter timeout applies only to expedited grace periods * tag 'rcu.2022.05.19a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (80 commits) rcu: Move expedited grace period (GP) work to RT kthread_worker rcu: Introduce CONFIG_RCU_EXP_CPU_STALL_TIMEOUT srcu: Drop needless initialization of sdp in srcu_gp_start() srcu: Prevent expedited GPs and blocking readers from consuming CPU srcu: Add contention check to call_srcu() srcu_data ->lock acquisition srcu: Automatically determine size-transition strategy at boot rcutorture: Make torture.sh allow for --kasan rcutorture: Make torture.sh refscale and rcuscale specify Tasks Trace RCU rcutorture: Make kvm.sh allow more memory for --kasan runs torture: Save "make allmodconfig" .config file scftorture: Remove extraneous "scf" from per_version_boot_params rcutorture: Adjust scenarios' Kconfig options for CONFIG_PREEMPT_DYNAMIC torture: Enable CSD-lock stall reports for scftorture torture: Skip vmlinux check for kvm-again.sh runs scftorture: Adjust for TASKS_RCU Kconfig option being selected rcuscale: Allow rcuscale without RCU Tasks Rude/Trace rcuscale: Allow rcuscale without RCU Tasks refscale: Allow refscale without RCU Tasks Rude/Trace refscale: Allow refscale without RCU Tasks rcutorture: Allow specifying per-scenario stat_interval ...
2022-05-23Merge tag 'efi-next-for-v5.19' of ↵Linus Torvalds3-0/+12
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI updates from Ard Biesheuvel: - Allow runtime services to be re-enabled at boot on RT kernels. - Provide access to secrets injected into the boot image by CoCo hypervisors (COnfidential COmputing) - Use DXE services on x86 to make the boot image executable after relocation, if needed. - Prefer mirrored memory for randomized allocations. - Only randomize the placement of the kernel image on arm64 if the loader has not already done so. - Add support for obtaining the boot hartid from EFI on RISC-V. * tag 'efi-next-for-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: riscv/efi_stub: Add support for RISCV_EFI_BOOT_PROTOCOL efi: stub: prefer mirrored memory for randomized allocations efi/arm64: libstub: run image in place if randomized by the loader efi: libstub: pass image handle to handle_kernel_image() efi: x86: Set the NX-compatibility flag in the PE header efi: libstub: ensure allocated memory to be executable efi: libstub: declare DXE services table efi: Add missing prototype for efi_capsule_setup_info docs: security: Add secrets/coco documentation efi: Register efi_secret platform device if EFI secret area is declared virt: Add efi_secret module to expose confidential computing secrets efi: Save location of EFI confidential computing area efi: Allow to enable EFI runtime services by default on RT
2022-05-20Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds5-13/+20
Pull kvm fixes from Paolo Bonzini: "ARM: - Correctly expose GICv3 support even if no irqchip is created so that userspace doesn't observe it changing pointlessly (fixing a regression with QEMU) - Don't issue a hypercall to set the id-mapped vectors when protected mode is enabled (fix for pKVM in combination with CPUs affected by Spectre-v3a) x86 (five oneliners, of which the most interesting two are): - a NULL pointer dereference on INVPCID executed with paging disabled, but only if KVM is using shadow paging - an incorrect bsearch comparison function which could truncate the result and apply PMU event filtering incorrectly. This one comes with a selftests update too" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86/mmu: fix NULL pointer dereference on guest INVPCID KVM: x86: hyper-v: fix type of valid_bank_mask KVM: Free new dirty bitmap if creating a new memslot fails KVM: eventfd: Fix false positive RCU usage warning selftests: kvm/x86: Verify the pmu event filter matches the correct event selftests: kvm/x86: Add the helper function create_pmu_event_filter kvm: x86/pmu: Fix the compare function used by the pmu event filter KVM: arm64: Don't hypercall before EL2 init KVM: arm64: vgic-v3: Consistently populate ID_AA64PFR0_EL1.GIC KVM: x86/mmu: Update number of zapped pages even if page list is stable
2022-05-20Merge tag 'riscv-for-linus-5.18-rc8' of ↵Linus Torvalds2-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: - fix the fu540-c000 device tree to avoid a schema check failure on the DMA node name - fix typo in the PolarFire SOC device tree * tag 'riscv-for-linus-5.18-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: dts: microchip: fix gpio1 reg property typo riscv: dts: sifive: fu540-c000: align dma node name with dtschema
2022-05-20Merge tag 'arm64-fixes' of ↵Linus Torvalds3-15/+39
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: "Three arm64 fixes for -rc8/final. The MTE and stolen time fixes have been doing the rounds for a little while, but review and testing feedback was ongoing until earlier this week. The kexec fix showed up on Monday and addresses a failure observed under Qemu. Summary: - Add missing write barrier to publish MTE tags before a pte update - Fix kexec relocation clobbering its own data structures - Fix stolen time crash if a timer IRQ fires during CPU hotplug" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: mte: Ensure the cleared tags are visible before setting the PTE arm64: kexec: load from kimage prior to clobbering arm64: paravirt: Use RCU read locks to guard stolen_time
2022-05-20KVM: x86/mmu: fix NULL pointer dereference on guest INVPCIDPaolo Bonzini1-2/+4
With shadow paging enabled, the INVPCID instruction results in a call to kvm_mmu_invpcid_gva. If INVPCID is executed with CR0.PG=0, the invlpg callback is not set and the result is a NULL pointer dereference. Fix it trivially by checking for mmu->invlpg before every call. There are other possibilities: - check for CR0.PG, because KVM (like all Intel processors after P5) flushes guest TLB on CR0.PG changes so that INVPCID/INVLPG are a nop with paging disabled - check for EFER.LMA, because KVM syncs and flushes when switching MMU contexts outside of 64-bit mode All of these are tricky, go for the simple solution. This is CVE-2022-1789. Reported-by: Yongkang Jia <kangel@zju.edu.cn> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-05-20KVM: x86: hyper-v: fix type of valid_bank_maskYury Norov1-2/+2
In kvm_hv_flush_tlb(), valid_bank_mask is declared as unsigned long, but is used as u64, which is wrong for i386, and has been spotted by LKP after applying "KVM: x86: hyper-v: replace bitmap_weight() with hweight64()" https://lore.kernel.org/lkml/20220510154750.212913-12-yury.norov@gmail.com/ But it's wrong even without that patch because now bitmap_weight() dereferences a word after valid_bank_mask on i386. >> include/asm-generic/bitops/const_hweight.h:21:76: warning: right shift count >= width of type +[-Wshift-count-overflow] 21 | #define __const_hweight64(w) (__const_hweight32(w) + __const_hweight32((w) >> 32)) | ^~ include/asm-generic/bitops/const_hweight.h:10:16: note: in definition of macro '__const_hweight8' 10 | ((!!((w) & (1ULL << 0))) + \ | ^ include/asm-generic/bitops/const_hweight.h:20:31: note: in expansion of macro '__const_hweight16' 20 | #define __const_hweight32(w) (__const_hweight16(w) + __const_hweight16((w) >> 16)) | ^~~~~~~~~~~~~~~~~ include/asm-generic/bitops/const_hweight.h:21:54: note: in expansion of macro '__const_hweight32' 21 | #define __const_hweight64(w) (__const_hweight32(w) + __const_hweight32((w) >> 32)) | ^~~~~~~~~~~~~~~~~ include/asm-generic/bitops/const_hweight.h:29:49: note: in expansion of macro '__const_hweight64' 29 | #define hweight64(w) (__builtin_constant_p(w) ? __const_hweight64(w) : __arch_hweight64(w)) | ^~~~~~~~~~~~~~~~~ arch/x86/kvm/hyperv.c:1983:36: note: in expansion of macro 'hweight64' 1983 | if (hc->var_cnt != hweight64(valid_bank_mask)) | ^~~~~~~~~ CC: Borislav Petkov <bp@alien8.de> CC: Dave Hansen <dave.hansen@linux.intel.com> CC: H. Peter Anvin <hpa@zytor.com> CC: Ingo Molnar <mingo@redhat.com> CC: Jim Mattson <jmattson@google.com> CC: Joerg Roedel <joro@8bytes.org> CC: Paolo Bonzini <pbonzini@redhat.com> CC: Sean Christopherson <seanjc@google.com> CC: Thomas Gleixner <tglx@linutronix.de> CC: Vitaly Kuznetsov <vkuznets@redhat.com> CC: Wanpeng Li <wanpengli@tencent.com> CC: kvm@vger.kernel.org CC: linux-kernel@vger.kernel.org CC: x86@kernel.org Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Yury Norov <yury.norov@gmail.com> Message-Id: <20220519171504.1238724-1-yury.norov@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-05-20kvm: x86/pmu: Fix the compare function used by the pmu event filterAaron Lewis1-2/+5
When returning from the compare function the u64 is truncated to an int. This results in a loss of the high nybble[1] in the event select and its sign if that nybble is in use. Switch from using a result that can end up being truncated to a result that can only be: 1, 0, -1. [1] bits 35:32 in the event select register and bits 11:8 in the event select. Fixes: 7ff775aca48ad ("KVM: x86/pmu: Use binary search to check filtered events") Signed-off-by: Aaron Lewis <aaronlewis@google.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220517051238.2566934-1-aaronlewis@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-05-20x86/tdx: Fix RETs in TDX asmPeter Zijlstra1-2/+2
Because build-testing is over-rated, fix a few trivial objtool complaints: vmlinux.o: warning: objtool: __tdx_module_call+0x3e: missing int3 after ret vmlinux.o: warning: objtool: __tdx_hypercall+0x6e: missing int3 after ret Fixes: eb94f1b6a70a ("x86/tdx: Add __tdx_module_call() and __tdx_hypercall() helper functions") Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20220520083839.GR2578@worktop.programming.kicks-ass.net
2022-05-20x86/entry: Fixup objtool/ibt validationPeter Zijlstra2-0/+6
Commit 47f33de4aafb ("x86/sev: Mark the code returning to user space as syscall gap") added a bunch of text references without annotating them, resulting in a spree of objtool complaints: vmlinux.o: warning: objtool: vc_switch_off_ist+0x77: relocation to !ENDBR: entry_SYSCALL_64+0x15c vmlinux.o: warning: objtool: vc_switch_off_ist+0x8f: relocation to !ENDBR: entry_SYSCALL_compat+0xa5 vmlinux.o: warning: objtool: vc_switch_off_ist+0x97: relocation to !ENDBR: .entry.text+0x21ea vmlinux.o: warning: objtool: vc_switch_off_ist+0xef: relocation to !ENDBR: .entry.text+0x162 vmlinux.o: warning: objtool: __sev_es_ist_enter+0x60: relocation to !ENDBR: entry_SYSCALL_64+0x15c vmlinux.o: warning: objtool: __sev_es_ist_enter+0x6c: relocation to !ENDBR: .entry.text+0x162 vmlinux.o: warning: objtool: __sev_es_ist_enter+0x8a: relocation to !ENDBR: entry_SYSCALL_compat+0xa5 vmlinux.o: warning: objtool: __sev_es_ist_enter+0xc1: relocation to !ENDBR: .entry.text+0x21ea Since these text references are used to compare against IP, and are not an indirect call target, they don't need ENDBR so annotate them away. Fixes: 47f33de4aafb ("x86/sev: Mark the code returning to user space as syscall gap") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20220520082604.GQ2578@worktop.programming.kicks-ass.net
2022-05-19riscv: dts: microchip: fix gpio1 reg property typoConor Paxton1-1/+1
Fix reg address typo in the gpio1 stanza. Signed-off-by: Conor Paxton <conor.paxton@microchip.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Fixes: 528a5b1f2556 ("riscv: dts: microchip: add new peripherals to icicle kit device tree") Link: https://lore.kernel.org/r/20220517104058.2004734-1-conor.paxton@microchip.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-05-19x86/entry: Fix register corruption in compat syscallJosh Poimboeuf2-5/+5
A panic was reported in the init process on AMD: Run /sbin/init as init process init[1]: segfault at f7fd5ca0 ip 00000000f7f5bbc7 sp 00000000ffa06aa0 error 7 in libc.so[f7f51000+4e000] Code: 8a 44 24 10 88 41 ff 8b 44 24 10 83 c4 2c 5b 5e 5f 5d c3 53 83 ec 08 8b 5c 24 10 81 fb 00 f0 ff ff 76 0c e8 ba dc ff ff f7 db <89> 18 83 cb ff 83 c4 08 89 d8 5b c3 e8 81 60 ff ff 05 28 84 07 00 Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b CPU: 1 PID: 1 Comm: init Tainted: G W 5.18.0-rc7-next-20220519 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x57/0x7d panic+0x10f/0x28d do_exit.cold+0x18/0x48 do_group_exit+0x2e/0xb0 get_signal+0xb6d/0xb80 arch_do_signal_or_restart+0x31/0x760 ? show_opcodes.cold+0x1c/0x21 ? force_sig_fault+0x49/0x70 exit_to_user_mode_prepare+0x131/0x1a0 irqentry_exit_to_user_mode+0x5/0x30 asm_exc_page_fault+0x27/0x30 RIP: 0023:0xf7f5bbc7 Code: 8a 44 24 10 88 41 ff 8b 44 24 10 83 c4 2c 5b 5e 5f 5d c3 53 83 ec 08 8b 5c 24 10 81 fb 00 f0 ff ff 76 0c e8 ba dc ff ff f7 db <89> 18 83 cb ff 83 c4 08 89 d8 5b c3 e8 81 60 ff ff 05 28 84 07 00 RSP: 002b:00000000ffa06aa0 EFLAGS: 00000217 RAX: 00000000f7fd5ca0 RBX: 000000000000000c RCX: 0000000000001000 RDX: 0000000000000001 RSI: 00000000f7fd5b60 RDI: 00000000f7fd5b60 RBP: 00000000f7fd1c1c R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 </TASK> The task's CX register got corrupted by commit 8c42819b61b8 ("x86/entry: Use PUSH_AND_CLEAR_REGS for compat"), which overlooked the fact that compat SYSCALL apparently stores the user's CX value in BP. Before that commit, CX was saved from its stashed value in BP: pushq %rbp /* pt_regs->cx (stashed in bp) */ But then it got changed to: pushq %rcx /* pt_regs->cx */ So the wrong value got saved and later restored back to the user. Fix it by pushing the correct value again (BP) for regs->cx. Fixes: 8c42819b61b8 ("x86/entry: Use PUSH_AND_CLEAR_REGS for compat") Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Guenter Roeck <linux@roeck-us.net> Link: https://lkml.kernel.org/r/b5a26592c9dd60bbacdf97974a7433fd802a5593.1652985970.git.jpoimboe@kernel.org
2022-05-19riscv: dts: sifive: fu540-c000: align dma node name with dtschemaKrzysztof Kozlowski1-1/+1
Fixes dtbs_check warnings like: dma@3000000: $nodename:0: 'dma@3000000' does not match '^dma-controller(@.*)?$' Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Link: https://lore.kernel.org/r/20220407193856.18223-1-krzysztof.kozlowski@linaro.org Fixes: c5ab54e9945b ("riscv: dts: add support for PDMA device of HiFive Unleashed Rev A00") Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-05-19Merge tag 'for-5.18/parisc-4' of ↵Linus Torvalds5-143/+251
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull parisc architecture fixes from Helge Deller: "We had two big outstanding issues after v5.18-rc6: a) 32-bit kernels on 64-bit machines (e.g. on a C3700 which is able to run 32- and 64-bit kernels) failed early in userspace. b) 64-bit kernels on PA8800/PA8900 CPUs (e.g. in a C8000) showed random userspace segfaults. We assumed that those problems were caused by the tmpalias flushes. Dave did a lot of testing and reorganization of the current flush code and fixed the 32-bit cache flushing. For PA8800/PA8900 CPUs he switched the code to flush using the virtual address of user and kernel pages instead of using tmpalias flushes. The tmpalias flushes don't seem to work reliable on such CPUs. We tested the patches on a wide range machines (715/64, B160L, C3000, C3700, C8000, rp3440) and they have been in for-next without any conflicts. Summary: - Rewrite the cache flush code for PA8800/PA8900 CPUs to flush using the virtual address of user and kernel pages instead of using tmpalias flushes. Testing showed, that tmpalias flushes don't work reliably on PA8800/PA8900 CPUs - Fix flush code to allow 32-bit kernels to run on 64-bit capable machines, e.g. a 32-bit kernel on C3700 machines" * tag 'for-5.18/parisc-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: Fix patch code locking and flushing parisc: Rewrite cache flush code for PA8800/PA8900 parisc: Disable debug code regarding cache flushes in handle_nadtlb_fault()
2022-05-19Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-armLinus Torvalds2-1/+2
Pull ARM fixes from Russell King: "Two further fixes for Spectre-BHB from Ard for Cortex A15 and to use the wide branch instruction for Thumb2" * tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: 9197/1: spectre-bhb: fix loop8 sequence for Thumb2 ARM: 9196/1: spectre-bhb: enable for Cortex-A15
2022-05-19x86/boot: Wrap literal addresses in absolute_pointer()Kees Cook2-13/+25
GCC 11 (incorrectly[1]) assumes that literal values cast to (void *) should be treated like a NULL pointer with an offset, and raises diagnostics when doing bounds checking under -Warray-bounds. GCC 12 got "smarter" about finding these: In function 'rdfs8', inlined from 'vga_recalc_vertical' at /srv/code/arch/x86/boot/video-mode.c:124:29, inlined from 'set_mode' at /srv/code/arch/x86/boot/video-mode.c:163:3: /srv/code/arch/x86/boot/boot.h:114:9: warning: array subscript 0 is outside array bounds of 'u8[0]' {aka 'unsigned char[]'} [-Warray-bounds] 114 | asm volatile("movb %%fs:%1,%0" : "=q" (v) : "m" (*(u8 *)addr)); | ^~~ This has been solved in other places[2] already by using the recently added absolute_pointer() macro. Do the same here. [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99578 [2] https://lore.kernel.org/all/20210912160149.2227137-1-linux@roeck-us.net/ Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20220227195918.705219-1-keescook@chromium.org
2022-05-19x86/sev: Mark the code returning to user space as syscall gapLai Jiangshan4-0/+12
When returning to user space, %rsp is user-controlled value. If it is a SNP-guest and the hypervisor decides to mess with the code-page for this path while a CPU is executing it, a potential #VC could hit in the syscall return path and mislead the #VC handler. So make ip_within_syscall_gap() return true in this case. Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Joerg Roedel <jroedel@suse.de> Link: https://lore.kernel.org/r/20220412124909.10467-1-jiangshanlai@gmail.com
2022-05-18Merge branch 'arm/fixes' of ↵Linus Torvalds6-11/+69
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC fixes from Arnd Bergmann: "The SoC bug fixes have calmed down sufficiently, there is one minor update for the MAINTAINERS file, and few bug fixes for dts descriptions: - Updates to the BananaPi R2-Pro (rk3568) dts to match production hardware rather than the prototype version. - Qualcomm sm8250 soundwire gets disabled on some machines to avoid crashes - A number of aspeed SoC specific fixes, addressing incorrect pin cotrol settings, some values in the romed8hm board, and a revert for an accidental removal of a DT node" * 'arm/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: MAINTAINERS: omap: remove me as a maintainer ARM: dts: aspeed: Add video engine to g6 ARM: dts: aspeed: romed8hm3: Fix GPIOB0 name ARM: dts: aspeed: romed8hm3: Add lm25066 sense resistor values ARM: dts: aspeed-g6: fix SPI1/SPI2 quad pin group ARM: dts: aspeed-g6: add FWQSPI group in pinctrl dtsi dt-bindings: pinctrl: aspeed-g6: add FWQSPI function/group pinctrl: pinctrl-aspeed-g6: add FWQSPI function-group dt-bindings: pinctrl: aspeed-g6: remove FWQSPID group pinctrl: pinctrl-aspeed-g6: remove FWQSPID group in pinctrl ARM: dts: aspeed-g6: remove FWQSPID group in pinctrl dtsi arm64: dts: qcom: sm8250: don't enable rx/tx macro by default arm64: dts: rockchip: Add gmac1 and change network settings of bpi-r2-pro arm64: dts: rockchip: Change io-domains of bpi-r2-pro
2022-05-18x86/sev: Annotate stack change in the #VC handlerLai Jiangshan1-0/+1
In idtentry_vc(), vc_switch_off_ist() determines a safe stack to switch to, off of the IST stack. Annotate the new stack switch with ENCODE_FRAME_POINTER in case UNWINDER_FRAME_POINTER is used. A stack walk before looks like this: CPU: 0 PID: 0 Comm: swapper Not tainted 5.18.0-rc7+ #2 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 Call Trace: <TASK> dump_stack_lvl dump_stack kernel_exc_vmm_communication asm_exc_vmm_communication ? native_read_msr ? __x2apic_disable.part.0 ? x2apic_setup ? cpu_init ? trap_init ? start_kernel ? x86_64_start_reservations ? x86_64_start_kernel ? secondary_startup_64_no_verify </TASK> and with the fix, the stack dump is exact: CPU: 0 PID: 0 Comm: swapper Not tainted 5.18.0-rc7+ #3 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 Call Trace: <TASK> dump_stack_lvl dump_stack kernel_exc_vmm_communication asm_exc_vmm_communication RIP: 0010:native_read_msr Code: ... < snipped regs > ? __x2apic_disable.part.0 x2apic_setup cpu_init trap_init start_kernel x86_64_start_reservations x86_64_start_kernel secondary_startup_64_no_verify </TASK> [ bp: Test in a SEV-ES guest and rewrite the commit message to explain what exactly this does. ] Fixes: a13644f3a53d ("x86/entry/64: Add entry code for #VC handler") Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lore.kernel.org/r/20220316041612.71357-1-jiangshanlai@gmail.com
2022-05-18ARM: 9197/1: spectre-bhb: fix loop8 sequence for Thumb2Ard Biesheuvel1-1/+1
In Thumb2, 'b . + 4' produces a branch instruction that uses a narrow encoding, and so it does not jump to the following instruction as expected. So use W(b) instead. Fixes: 6c7cb60bff7a ("ARM: fix Thumb2 regression with Spectre BHB") Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-05-18ARM: 9196/1: spectre-bhb: enable for Cortex-A15Ard Biesheuvel1-0/+1
The Spectre-BHB mitigations were inadvertently left disabled for Cortex-A15, due to the fact that cpu_v7_bugs_init() is not called in that case. So fix that. Fixes: b9baf5c8c5c3 ("ARM: Spectre-BHB workaround") Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-05-17parisc: Fix patch code locking and flushingJohn David Anglin1-14/+11
This change fixes the following: 1) The flags variable is not initialized. Always use raw_spin_lock_irqsave and raw_spin_unlock_irqrestore to serialize patching. 2) flush_kernel_vmap_range is primarily intended for DMA flushes. The whole cache flush in flush_kernel_vmap_range is only possible when interrupts are enabled on SMP machines. Since __patch_text_multiple calls flush_kernel_vmap_range with interrupts disabled, it is better to directly call flush_kernel_dcache_range_asm and flush_kernel_icache_range_asm. 3) The final call to flush_icache_range is unnecessary. Tested with `[PATCH, V3] parisc: Rewrite cache flush code for PA8800/PA8900' change on rp3440, c8000 and c3750 (32 and 64-bit). Note by Helge: This patch had been temporarily reverted shortly before v5.18-rc6 in order to fix boot issues. Now it can be re-applied. Signed-off-by: John David Anglin <dave.anglin@bell.net> Signed-off-by: Helge Deller <deller@gmx.de>
2022-05-17parisc: Rewrite cache flush code for PA8800/PA8900John David Anglin3-127/+236
Originally, I was convinced that we needed to use tmpalias flushes everwhere, for both user and kernel flushes. However, when I modified flush_kernel_dcache_page_addr, to use a tmpalias flush, my c8000 would crash quite early when booting. The PDC returns alias values of 0 for the icache and dcache. This indicates that either the alias boundary is greater than 16MB or equivalent aliasing doesn't work. I modified the tmpalias code to make it easy to try alternate boundaries. I tried boundaries up to 128MB but still kernel tmpalias flushes didn't work on c8000. This led me to conclude that tmpalias flushes don't work on PA8800 and PA8900 machines, and that we needed to flush directly using the virtual address of user and kernel pages. This is likely the major cause of instability on the c8000 and rp34xx machines. Flushing user pages requires doing a temporary context switch as we have to flush pages that don't belong to the current context. Further, we have to deal with pages that aren't present. If a page isn't present, the flush instructions fault on every line. Other code has been rearranged and simplified based on testing. For example, I introduced a flush_cache_dup_mm routine. flush_cache_mm and flush_cache_dup_mm differ in that flush_cache_mm calls purge_cache_pages and flush_cache_dup_mm calls flush_cache_pages. In some implementations, pdc is more efficient than fdc. Based on my testing, I don't believe there's any performance benefit on the c8000. Signed-off-by: John David Anglin <dave.anglin@bell.net> Signed-off-by: Helge Deller <deller@gmx.de>
2022-05-17parisc: Disable debug code regarding cache flushes in handle_nadtlb_fault()John David Anglin1-2/+4
Change the "BUG" to "WARNING" and disable the message because it triggers occasionally in spite of the check in flush_cache_page_if_present. The pte value extracted for the "from" page in copy_user_highpage is racy and occasionally the pte is cleared before the flush is complete. I assume that the page is simultaneously flushed by flush_cache_mm before the pte is cleared as nullifying the fdc doesn't seem to cause problems. I investigated various locking scenarios but I wasn't able to find a way to sequence the flushes. This code is called for every COW break and locks impact performance. This patch is related to the bigger cache flush patch because we need the pte on PA8800/PA8900 to flush using the vma context. I have also seen this from copy_to_user_page and copy_from_user_page. The messages appear infrequently when enabled. Signed-off-by: John David Anglin <dave.anglin@bell.net> Signed-off-by: Helge Deller <deller@gmx.de>
2022-05-17Merge tag 'kvmarm-fixes-5.18-3' of ↵Paolo Bonzini2-3/+3
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 fixes for 5.18, take #3 - Correctly expose GICv3 support even if no irqchip is created so that userspace doesn't observe it changing pointlessly (fixing a regression with QEMU) - Don't issue a hypercall to set the id-mapped vectors when protected mode is enabled (fix for pKVM in combination with CPUs affected by Spectre-v3a)
2022-05-17arm64: mte: Ensure the cleared tags are visible before setting the PTECatalin Marinas1-0/+3
As an optimisation, only pages mapped with PROT_MTE in user space have the MTE tags zeroed. This is done lazily at the set_pte_at() time via mte_sync_tags(). However, this function is missing a barrier and another CPU may see the PTE updated before the zeroed tags are visible. Add an smp_wmb() barrier if the mapping is Normal Tagged. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Fixes: 34bfeea4a9e9 ("arm64: mte: Clear the tags when a page is mapped in user-space with PROT_MTE") Cc: <stable@vger.kernel.org> # 5.10.x Reported-by: Vladimir Murzin <vladimir.murzin@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Steven Price <steven.price@arm.com> Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> Link: https://lore.kernel.org/r/20220517093532.127095-1-catalin.marinas@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2022-05-17arm64: kexec: load from kimage prior to clobberingMark Rutland1-7/+15
In arm64_relocate_new_kernel() we load some fields out of the kimage structure after relocation has occurred. As the kimage structure isn't allocated to be relocation-safe, it may be clobbered during relocation, and we may load junk values out of the structure. Due to this, kexec may fail when the kimage allocation happens to fall within a PA range that an object will be relocated to. This has been observed to occur for regular kexec on a QEMU TCG 'virt' machine with 2GiB of RAM, where the PA range of the new kernel image overlaps the kimage structure. Avoid this by ensuring we load all values from the kimage structure prior to relocation. I've tested this atop v5.16 and v5.18-rc6. Fixes: 878fdbd70486 ("arm64: kexec: pass kimage as the only argument to relocation function") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com> Link: https://lore.kernel.org/r/20220516160735.731404-1-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2022-05-17arm64: paravirt: Use RCU read locks to guard stolen_timePrakruthi Deepak Heragu1-8/+21
During hotplug, the stolen time data structure is unmapped and memset. There is a possibility of the timer IRQ being triggered before memset and stolen time is getting updated as part of this timer IRQ handler. This causes the below crash in timer handler - [ 3457.473139][ C5] Unable to handle kernel paging request at virtual address ffffffc03df05148 ... [ 3458.154398][ C5] Call trace: [ 3458.157648][ C5] para_steal_clock+0x30/0x50 [ 3458.162319][ C5] irqtime_account_process_tick+0x30/0x194 [ 3458.168148][ C5] account_process_tick+0x3c/0x280 [ 3458.173274][ C5] update_process_times+0x5c/0xf4 [ 3458.178311][ C5] tick_sched_timer+0x180/0x384 [ 3458.183164][ C5] __run_hrtimer+0x160/0x57c [ 3458.187744][ C5] hrtimer_interrupt+0x258/0x684 [ 3458.192698][ C5] arch_timer_handler_virt+0x5c/0xa0 [ 3458.198002][ C5] handle_percpu_devid_irq+0xdc/0x414 [ 3458.203385][ C5] handle_domain_irq+0xa8/0x168 [ 3458.208241][ C5] gic_handle_irq.34493+0x54/0x244 [ 3458.213359][ C5] call_on_irq_stack+0x40/0x70 [ 3458.218125][ C5] do_interrupt_handler+0x60/0x9c [ 3458.223156][ C5] el1_interrupt+0x34/0x64 [ 3458.227560][ C5] el1h_64_irq_handler+0x1c/0x2c [ 3458.232503][ C5] el1h_64_irq+0x7c/0x80 [ 3458.236736][ C5] free_vmap_area_noflush+0x108/0x39c [ 3458.242126][ C5] remove_vm_area+0xbc/0x118 [ 3458.246714][ C5] vm_remove_mappings+0x48/0x2a4 [ 3458.251656][ C5] __vunmap+0x154/0x278 [ 3458.255796][ C5] stolen_time_cpu_down_prepare+0xc0/0xd8 [ 3458.261542][ C5] cpuhp_invoke_callback+0x248/0xc34 [ 3458.266842][ C5] cpuhp_thread_fun+0x1c4/0x248 [ 3458.271696][ C5] smpboot_thread_fn+0x1b0/0x400 [ 3458.276638][ C5] kthread+0x17c/0x1e0 [ 3458.280691][ C5] ret_from_fork+0x10/0x20 As a fix, introduce rcu lock to update stolen time structure. Fixes: 75df529bec91 ("arm64: paravirt: Initialize steal time when cpu is online") Cc: stable@vger.kernel.org Suggested-by: Will Deacon <will@kernel.org> Signed-off-by: Prakruthi Deepak Heragu <quic_pheragu@quicinc.com> Signed-off-by: Elliot Berman <quic_eberman@quicinc.com> Reviewed-by: Srivatsa S. Bhat (VMware) <srivatsa@csail.mit.edu> Link: https://lore.kernel.org/r/20220513174654.362169-1-quic_eberman@quicinc.com Signed-off-by: Will Deacon <will@kernel.org>
2022-05-17x86/sev: Remove duplicated assignment to variable infoColin Ian King1-4/+1
Variable info is being assigned the same value twice, remove the redundant assignment. Also assign variable v in the declaration. Cleans up clang scan warning: warning: Value stored to 'info' during its initialization is never read [deadcode.DeadStores] No code changed: # arch/x86/kernel/sev.o: text data bss dec hex filename 19878 4487 4112 28477 6f3d sev.o.before 19878 4487 4112 28477 6f3d sev.o.after md5: bfbaa515af818615fd01fea91e7eba1b sev.o.before.asm bfbaa515af818615fd01fea91e7eba1b sev.o.after.asm [ bp: Running the before/after check on sev.c because sev-shared.c gets included into it. ] Fixes: 597cfe48212a ("x86/boot/compressed/64: Setup a GHCB-based VC Exception handler") Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20220516184215.51841-1-colin.i.king@gmail.com
2022-05-17Merge branch irq/gic-v3-nmi-fixes-5.19 into irq/irqchip-nextMarc Zyngier2-12/+1
* irq/gic-v3-nmi-fixes-5.19: : . : GICv3 pseudo-NMI fixes from Mark Rutland: : : "These patches fix a couple of issues with the way GICv3 pseudo-NMIs are : handled: : : * The first patch adds a barrier we missed from NMI handling due to an : oversight. : : * The second patch refactors some logic around reads from ICC_IAR1_EL1 : and adds commentary to explain what's going on. : : * The third patch descends into madness, reworking gic_handle_irq() to : consistently manage ICC_PMR_EL1 + DAIF and avoid cases where these can : be left in an inconsistent state while softirqs are processed." : . irqchip/gic-v3: Fix priority mask handling irqchip/gic-v3: Refactor ISB + EOIR at ack time irqchip/gic-v3: Ensure pseudo-NMIs have an ISB between ack and handling Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-05-17irqchip: Add Kconfig symbols for sunxi driversSamuel Holland2-6/+11
Not all of these drivers are needed on every ARCH_SUNXI platform. In particular, the ARCH_SUNXI symbol will be reused for the Allwinner D1, a RISC-V SoC which contains none of these irqchips. Introduce Kconfig symbols so we can select only the drivers actually used by a particular set of platforms. This also lets us move the irqchip driver dependencies to a more appropriate location. Signed-off-by: Samuel Holland <samuel@sholland.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220509034941.30704-1-samuel@sholland.org
2022-05-15irqchip/gic-v3: Refactor ISB + EOIR at ack timeMark Rutland2-12/+1
There are cases where a context synchronization event is necessary between an IRQ being raised and being handled, and there are races such that we cannot rely upon the exception entry being subsequent to the interrupt being raised. To fix this, we place an ISB between a read of IAR and the subsequent invocation of an IRQ handler. When EOI mode 1 is in use, we need to EOI an interrupt prior to invoking its handler, and we have a write to EOIR for this. As this write to EOIR requires an ISB, and this is provided by the gic_write_eoir() helper, we omit the usual ISB in this case, with the logic being: | if (static_branch_likely(&supports_deactivate_key)) | gic_write_eoir(irqnr); | else | isb(); This is somewhat opaque, and it would be a little clearer if there were an unconditional ISB, with only the write to EOIR being conditional, e.g. | if (static_branch_likely(&supports_deactivate_key)) | write_gicreg(irqnr, ICC_EOIR1_EL1); | | isb(); This patch rewrites the code that way, with this logic factored into a new helper function with comments explaining what the ISB is for, as were originally laid out in commit: 39a06b67c2c1256b ("irqchip/gic: Ensure we have an ISB between ack and ->handle_irq") Note that since then, we removed the IAR polling in commit: 342677d70ab92142 ("irqchip/gic-v3: Remove acknowledge loop") ... which removed one of the two race conditions. For consistency, other portions of the driver are made to manipulate EOIR using write_gicreg() and explcit ISBs, and the gic_write_eoir() helper function is removed. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220513133038.226182-3-mark.rutland@arm.com
2022-05-15Merge tag 'powerpc-5.18-5' of ↵Linus Torvalds1-5/+21
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fix from Michael Ellerman: - Fix KVM PR on 32-bit, which was broken by some MMU code refactoring. Thanks to: Alexander Graf, and Matt Evans. * tag 'powerpc-5.18-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: KVM: PPC: Book3S PR: Enable MSR_DR for switch_mmu_context()
2022-05-15Merge tag 'x86-urgent-2022-05-15' of ↵Linus Torvalds1-2/+3
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fix from Thomas Gleixner: "A single fix for the handling of unpopulated sub-pmd spaces. The copy & pasta from the corresponding s390 code screwed up the address calculation for marking the sub-pmd ranges via memset by omitting the ALIGN_DOWN() to calculate the proper start address. It's a mystery why this code is not generic and shared because there is nothing architecture specific in there, but that's too intrusive for a backportable fix" * tag 'x86-urgent-2022-05-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm: Fix marking of unused sub-pmd ranges
2022-05-15KVM: arm64: Don't hypercall before EL2 initQuentin Perret1-1/+2
Will reported the following splat when running with Protected KVM enabled: [ 2.427181] ------------[ cut here ]------------ [ 2.427668] WARNING: CPU: 3 PID: 1 at arch/arm64/kvm/mmu.c:489 __create_hyp_private_mapping+0x118/0x1ac [ 2.428424] Modules linked in: [ 2.429040] CPU: 3 PID: 1 Comm: swapper/0 Not tainted 5.18.0-rc2-00084-g8635adc4efc7 #1 [ 2.429589] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015 [ 2.430286] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 2.430734] pc : __create_hyp_private_mapping+0x118/0x1ac [ 2.431091] lr : create_hyp_exec_mappings+0x40/0x80 [ 2.431377] sp : ffff80000803baf0 [ 2.431597] x29: ffff80000803bb00 x28: 0000000000000000 x27: 0000000000000000 [ 2.432156] x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000 [ 2.432561] x23: ffffcd96c343b000 x22: 0000000000000000 x21: ffff80000803bb40 [ 2.433004] x20: 0000000000000004 x19: 0000000000001800 x18: 0000000000000000 [ 2.433343] x17: 0003e68cf7efdd70 x16: 0000000000000004 x15: fffffc81f602a2c8 [ 2.434053] x14: ffffdf8380000000 x13: ffffcd9573200000 x12: ffffcd96c343b000 [ 2.434401] x11: 0000000000000004 x10: ffffcd96c1738000 x9 : 0000000000000004 [ 2.434812] x8 : ffff80000803bb40 x7 : 7f7f7f7f7f7f7f7f x6 : 544f422effff306b [ 2.435136] x5 : 000000008020001e x4 : ffff207d80a88c00 x3 : 0000000000000005 [ 2.435480] x2 : 0000000000001800 x1 : 000000014f4ab800 x0 : 000000000badca11 [ 2.436149] Call trace: [ 2.436600] __create_hyp_private_mapping+0x118/0x1ac [ 2.437576] create_hyp_exec_mappings+0x40/0x80 [ 2.438180] kvm_init_vector_slots+0x180/0x194 [ 2.458941] kvm_arch_init+0x80/0x274 [ 2.459220] kvm_init+0x48/0x354 [ 2.459416] arm_init+0x20/0x2c [ 2.459601] do_one_initcall+0xbc/0x238 [ 2.459809] do_initcall_level+0x94/0xb4 [ 2.460043] do_initcalls+0x54/0x94 [ 2.460228] do_basic_setup+0x1c/0x28 [ 2.460407] kernel_init_freeable+0x110/0x178 [ 2.460610] kernel_init+0x20/0x1a0 [ 2.460817] ret_from_fork+0x10/0x20 [ 2.461274] ---[ end trace 0000000000000000 ]--- Indeed, the Protected KVM mode promotes __create_hyp_private_mapping() to a hypercall as EL1 no longer has access to the hypervisor's stage-1 page-table. However, the call from kvm_init_vector_slots() happens after pKVM has been initialized on the primary CPU, but before it has been initialized on secondaries. As such, if the KVM initcall procedure is migrated from one CPU to another in this window, the hypercall may end up running on a CPU for which EL2 has not been initialized. Fortunately, the pKVM hypervisor doesn't rely on the host to re-map the vectors in the private range, so the hypercall in question is in fact superfluous. Skip it when pKVM is enabled. Reported-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> [maz: simplified the checks slightly] Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220513092607.35233-1-qperret@google.com
2022-05-15KVM: arm64: vgic-v3: Consistently populate ID_AA64PFR0_EL1.GICMarc Zyngier1-2/+1
When adding support for the slightly wonky Apple M1, we had to populate ID_AA64PFR0_EL1.GIC==1 to present something to the guest, as the HW itself doesn't advertise the feature. However, we gated this on the in-kernel irqchip being created. This causes some trouble for QEMU, which snapshots the state of the registers before creating a virtual GIC, and then tries to restore these registers once the GIC has been created. Obviously, between the two stages, ID_AA64PFR0_EL1.GIC has changed value, and the write fails. The fix is to actually emulate the HW, and always populate the field if the HW is capable of it. Fixes: 562e530fd770 ("KVM: arm64: Force ID_AA64PFR0_EL1.GIC=1 when exposing a virtual GICv3") Cc: stable@vger.kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org> Reported-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Oliver Upton <oupton@google.com> Link: https://lore.kernel.org/r/20220503211424.3375263-1-maz@kernel.org
2022-05-13Merge tag 'mm-hotfixes-stable-2022-05-11' of ↵Linus Torvalds4-0/+23
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "Seven MM fixes, three of which address issues added in the most recent merge window, four of which are cc:stable. Three non-MM fixes, none very serious" [ And yes, that's a real pull request from Andrew, not me creating a branch from emailed patches. Woo-hoo! ] * tag 'mm-hotfixes-stable-2022-05-11' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: MAINTAINERS: add a mailing list for DAMON development selftests: vm: Makefile: rename TARGETS to VMTARGETS mm/kfence: reset PG_slab and memcg_data before freeing __kfence_pool mailmap: add entry for martyna.szapar-mudlaw@intel.com arm[64]/memremap: don't abuse pfn_valid() to ensure presence of linear map procfs: prevent unprivileged processes accessing fdinfo dir mm: mremap: fix sign for EFAULT error return value mm/hwpoison: use pr_err() instead of dump_page() in get_any_page() mm/huge_memory: do not overkill when splitting huge_zero_page Revert "mm/memory-failure.c: skip huge_zero_page in memory_failure()"
2022-05-13Merge tag 'arm64-fixes' of ↵Linus Torvalds5-8/+7
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: - TLB invalidation workaround for Qualcomm Kryo-4xx "gold" CPUs - Fix broken dependency in the vDSO Makefile - Fix pointer authentication overrides in ISAR2 ID register * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: Enable repeat tlbi workaround on KRYO4XX gold CPUs arm64: cpufeature: remove duplicate ID_AA64ISAR2_EL1 entry arm64: vdso: fix makefile dependency on vdso.so
2022-05-13x86/mm: Fix marking of unused sub-pmd rangesAdrian-Ken Rueegsegger1-2/+3
The unused part precedes the new range spanned by the start, end parameters of vmemmap_use_new_sub_pmd(). This means it actually goes from ALIGN_DOWN(start, PMD_SIZE) up to start. Use the correct address when applying the mark using memset. Fixes: 8d400913c231 ("x86/vmemmap: handle unpopulated sub-pmd ranges") Signed-off-by: Adrian-Ken Rueegsegger <ken@codelabs.ch> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Oscar Salvador <osalvador@suse.de> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220509090637.24152-2-ken@codelabs.ch