linux - Linux Kernel (branches are rebased on master from time to time)

Age	Commit message (Collapse)	Author	Files	Lines
2020-12-24	Merge tag 'irq-core-2020-12-23' of ↵	Linus Torvalds	1	-1/+1
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq updates from Thomas Gleixner: "This is the second attempt after the first one failed miserably and got zapped to unblock the rest of the interrupt related patches. A treewide cleanup of interrupt descriptor (ab)use with all sorts of racy accesses, inefficient and disfunctional code. The goal is to remove the export of irq_to_desc() to prevent these things from creeping up again" * tag 'irq-core-2020-12-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (30 commits) genirq: Restrict export of irq_to_desc() xen/events: Implement irq distribution xen/events: Reduce irq_info:: Spurious_cnt storage size xen/events: Only force affinity mask for percpu interrupts xen/events: Use immediate affinity setting xen/events: Remove disfunct affinity spreading xen/events: Remove unused bind_evtchn_to_irq_lateeoi() net/mlx5: Use effective interrupt affinity net/mlx5: Replace irq_to_desc() abuse net/mlx4: Use effective interrupt affinity net/mlx4: Replace irq_to_desc() abuse PCI: mobiveil: Use irq_data_get_irq_chip_data() PCI: xilinx-nwl: Use irq_data_get_irq_chip_data() NTB/msi: Use irq_has_action() mfd: ab8500-debugfs: Remove the racy fiddling with irq_desc pinctrl: nomadik: Use irq_has_action() drm/i915/pmu: Replace open coded kstat_irqs() copy drm/i915/lpe_audio: Remove pointless irq_to_desc() usage s390/irq: Use irq_desc_kstat_cpu() in show_msi_interrupt() parisc/irq: Use irq_desc_kstat_cpu() in show_interrupts() ...
2020-12-20	epoll: fix compat syscall wire up of epoll_pwait2	Heiko Carstens	1	-1/+1
	Commit b0a0c2615f6f ("epoll: wire up syscall epoll_pwait2") wired up the 64 bit syscall instead of the compat variant in a couple of places. Fixes: b0a0c2615f6f ("epoll: wire up syscall epoll_pwait2") Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Cc: Willem de Bruijn <willemb@google.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-19	Merge branch 'akpm' (patches from Andrew)	Linus Torvalds	1	-0/+1
	Merge still more updates from Andrew Morton: "18 patches. Subsystems affected by this patch series: mm (memcg and cleanups) and epoll" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm/Kconfig: fix spelling mistake "whats" -> "what's" selftests/filesystems: expand epoll with epoll_pwait2 epoll: wire up syscall epoll_pwait2 epoll: add syscall epoll_pwait2 epoll: convert internal api to timespec64 epoll: eliminate unnecessary lock for zero timeout epoll: replace gotos with a proper loop epoll: pull all code between fetch_events and send_event into the loop epoll: simplify and optimize busy loop logic epoll: move eavail next to the list_empty_careful check epoll: pull fatal signal checks into ep_send_events() epoll: simplify signal handling epoll: check for events when removing a timed out thread from the wait queue mm/memcontrol:rewrite mem_cgroup_page_lruvec() mm, kvm: account kvm_vcpu_mmap to kmemcg mm/memcg: remove unused definitions mm/memcg: warning on !memcg after readahead page charged mm/memcg: bail early from swap accounting if memcg disabled
2020-12-19	epoll: wire up syscall epoll_pwait2	Willem de Bruijn	1	-0/+1
	Split off from prev patch in the series that implements the syscall. Link: https://lkml.kernel.org/r/20201121144401.3727659-4-willemdebruijn.kernel@gmail.com Signed-off-by: Willem de Bruijn <willemb@google.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-18	Merge tag 's390-5.11-2' of ↵	Linus Torvalds	4	-26/+11
	git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull more s390 updates from Heiko Carstens: "This is mainly to decouple udelay() and arch_cpu_idle() and simplify both of them. Summary: - Always initialize kernel stack backchain when entering the kernel, so that unwinding works properly. - Fix stack unwinder test case to avoid rare interrupt stack corruption. - Simplify udelay() and just let it busy loop instead of implementing a complex logic. - arch_cpu_idle() cleanup. - Some other minor improvements" * tag 's390-5.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/zcrypt: convert comma to semicolon s390/idle: allow arch_cpu_idle() to be kprobed s390/idle: remove raw_local_irq_save()/restore() from arch_cpu_idle() s390/idle: merge enabled_wait() and arch_cpu_idle() s390/delay: remove udelay_simple() s390/irq: select HAVE_IRQ_EXIT_ON_IRQ_STACK s390/delay: simplify udelay s390/test_unwind: use timer instead of udelay s390/test_unwind: fix CALL_ON_STACK tests s390: make calls to TRACE_IRQS_OFF/TRACE_IRQS_ON balanced s390: always clear kernel stack backchain before calling functions
2020-12-17	Merge tag 'trace-v5.11' of ↵	Linus Torvalds	1	-4/+16
	git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: "The major update to this release is that there's a new arch config option called CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS. Currently, only x86_64 enables it. All the ftrace callbacks now take a struct ftrace_regs instead of a struct pt_regs. If the architecture has HAVE_DYNAMIC_FTRACE_WITH_ARGS enabled, then the ftrace_regs will have enough information to read the arguments of the function being traced, as well as access to the stack pointer. This way, if a user (like live kernel patching) only cares about the arguments, then it can avoid using the heavier weight "regs" callback, that puts in enough information in the struct ftrace_regs to simulate a breakpoint exception (needed for kprobes). A new config option that audits the timestamps of the ftrace ring buffer at most every event recorded. Ftrace recursion protection has been cleaned up to move the protection to the callback itself (this saves on an extra function call for those callbacks). Perf now handles its own RCU protection and does not depend on ftrace to do it for it (saving on that extra function call). New debug option to add "recursed_functions" file to tracefs that lists all the places that triggered the recursion protection of the function tracer. This will show where things need to be fixed as recursion slows down the function tracer. The eval enum mapping updates done at boot up are now offloaded to a work queue, as it caused a noticeable pause on slow embedded boards. Various clean ups and last minute fixes" * tag 'trace-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (33 commits) tracing: Offload eval map updates to a work queue Revert: "ring-buffer: Remove HAVE_64BIT_ALIGNED_ACCESS" ring-buffer: Add rb_check_bpage in __rb_allocate_pages ring-buffer: Fix two typos in comments tracing: Drop unneeded assignment in ring_buffer_resize() tracing: Disable ftrace selftests when any tracer is running seq_buf: Avoid type mismatch for seq_buf_init ring-buffer: Fix a typo in function description ring-buffer: Remove obsolete rb_event_is_commit() ring-buffer: Add test to validate the time stamp deltas ftrace/documentation: Fix RST C code blocks tracing: Clean up after filter logic rewriting tracing: Remove the useless value assignment in test_create_synth_event() livepatch: Use the default ftrace_ops instead of REGS when ARGS is available ftrace/x86: Allow for arguments to be passed in to ftrace_regs by default ftrace: Have the callbacks receive a struct ftrace_regs instead of pt_regs MAINTAINERS: assign ./fs/tracefs to TRACING tracing: Fix some typos in comments ftrace: Remove unused varible 'ret' ring-buffer: Add recording of ring buffer recursion into recursed_functions ...
2020-12-16	Merge tag 'tif-task_work.arch-2020-12-14' of git://git.kernel.dk/linux-block	Linus Torvalds	2	-6/+7
	Pull TIF_NOTIFY_SIGNAL updates from Jens Axboe: "This sits on top of of the core entry/exit and x86 entry branch from the tip tree, which contains the generic and x86 parts of this work. Here we convert the rest of the archs to support TIF_NOTIFY_SIGNAL. With that done, we can get rid of JOBCTL_TASK_WORK from task_work and signal.c, and also remove a deadlock work-around in io_uring around knowing that signal based task_work waking is invoked with the sighand wait queue head lock. The motivation for this work is to decouple signal notify based task_work, of which io_uring is a heavy user of, from sighand. The sighand lock becomes a huge contention point, particularly for threaded workloads where it's shared between threads. Even outside of threaded applications it's slower than it needs to be. Roman Gershman <romger@amazon.com> reported that his networked workload dropped from 1.6M QPS at 80% CPU to 1.0M QPS at 100% CPU after io_uring was changed to use TIF_NOTIFY_SIGNAL. The time was all spent hammering on the sighand lock, showing 57% of the CPU time there [1]. There are further cleanups possible on top of this. One example is TIF_PATCH_PENDING, where a patch already exists to use TIF_NOTIFY_SIGNAL instead. Hopefully this will also lead to more consolidation, but the work stands on its own as well" [1] https://github.com/axboe/liburing/issues/215 * tag 'tif-task_work.arch-2020-12-14' of git://git.kernel.dk/linux-block: (28 commits) io_uring: remove 'twa_signal_ok' deadlock work-around kernel: remove checking for TIF_NOTIFY_SIGNAL signal: kill JOBCTL_TASK_WORK io_uring: JOBCTL_TASK_WORK is no longer used by task_work task_work: remove legacy TWA_SIGNAL path sparc: add support for TIF_NOTIFY_SIGNAL riscv: add support for TIF_NOTIFY_SIGNAL nds32: add support for TIF_NOTIFY_SIGNAL ia64: add support for TIF_NOTIFY_SIGNAL h8300: add support for TIF_NOTIFY_SIGNAL c6x: add support for TIF_NOTIFY_SIGNAL alpha: add support for TIF_NOTIFY_SIGNAL xtensa: add support for TIF_NOTIFY_SIGNAL arm: add support for TIF_NOTIFY_SIGNAL microblaze: add support for TIF_NOTIFY_SIGNAL hexagon: add support for TIF_NOTIFY_SIGNAL csky: add support for TIF_NOTIFY_SIGNAL openrisc: add support for TIF_NOTIFY_SIGNAL sh: add support for TIF_NOTIFY_SIGNAL um: add support for TIF_NOTIFY_SIGNAL ...
2020-12-16	s390/idle: allow arch_cpu_idle() to be kprobed	Heiko Carstens	1	-2/+0
	Remove NOKPROBE_SYMBOL() for arch_cpu_idle(). This might have made sense when enabled_wait() (aka arch_cpu_idle()) was called from udelay. But now there shouldn't be a reason why s390 should be the only architecture which doesn't allow arch_cpu_idle() to be probed. Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-12-16	s390/idle: remove raw_local_irq_save()/restore() from arch_cpu_idle()	Heiko Carstens	1	-5/+2
	arch_cpu_idle() gets called with interrupts disabled, and psw_idle() returns with interrupts disabled. No reason to use raw_local_irq_save() / restore(). Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-12-16	s390/idle: merge enabled_wait() and arch_cpu_idle()	Heiko Carstens	1	-8/+3
	The only caller of enabled_wait() besides arch_cpu_idle() was udelay(). Since that call doesn't exist anymore, merge enabled_wait() and arch_cpu_idle(). Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-12-16	s390/delay: remove udelay_simple()	Heiko Carstens	2	-2/+1
	udelay_simple() callers can make use of the now simplified udelay() implementation. No need to keep it. Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-12-16	s390/delay: simplify udelay	Heiko Carstens	1	-4/+0
	udelay is implemented by using quite subtle details to make it possible to load an idle psw and waiting for an interrupt even in irq context or when interrupts are disabled. Also handling (or better: no handling) of softirqs is taken into account. All this is done to optimize for something which should in normal circumstances never happen: calling udelay to busy wait. Therefore get rid of the whole complexity and just busy loop like other architectures are doing it also. It could have been possible to use diag 0x44 instead of cpu_relax() in the busy loop, however we have seen too many bad things happen with diag 0x44 that it seems to be better to simply busy loop. Also note that with this new implementation kernel preemption does work when within the udelay loop. This did not work before. To get a feeling what the former code optimizes for: IPL'ing a kernel with 'defconfig' and afterwards compiling a kernel ends with a total of zero udelay calls. Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-12-16	s390: make calls to TRACE_IRQS_OFF/TRACE_IRQS_ON balanced	Heiko Carstens	1	-2/+2
	In case of udelay CIF_IGNORE_IRQ is set. This leads to an unbalanced call of TRACE_IRQS_OFF and TRACE_IRQS_ON. That is: from lockdep's point of view TRACE_IRQS_ON is called one time too often. This doesn't fix any real bug, just makes the calls balanced. Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-12-16	s390: always clear kernel stack backchain before calling functions	Heiko Carstens	1	-6/+6
	Clear the kernel stack backchain before potentially calling the lockdep trace_hardirqs_off/on functions. Without this walking the kernel backchain, e.g. during a panic, might stop too early. Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-12-15	Merge tag 'irq-core-2020-12-15' of ↵	Linus Torvalds	1	-18/+33
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq updates from Thomas Gleixner: "Generic interrupt and irqchips subsystem updates. Unusually, there is not a single completely new irq chip driver, just new DT bindings and extensions of existing drivers to accomodate new variants! Core: - Consolidation and robustness changes for irq time accounting - Cleanup and consolidation of irq stats - Remove the fasteoi IPI flow which has been proved useless - Provide an interface for converting legacy interrupt mechanism into irqdomains Drivers: - Preliminary support for managed interrupts on platform devices - Correctly identify allocation of MSIs proxyied by another device - Generalise the Ocelot support to new SoCs - Improve GICv4.1 vcpu entry, matching the corresponding KVM optimisation - Work around spurious interrupts on Qualcomm PDC - Random fixes and cleanups" * tag 'irq-core-2020-12-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (54 commits) irqchip/qcom-pdc: Fix phantom irq when changing between rising/falling driver core: platform: Add devm_platform_get_irqs_affinity() ACPI: Drop acpi_dev_irqresource_disabled() resource: Add irqresource_disabled() genirq/affinity: Add irq_update_affinity_desc() irqchip/gic-v3-its: Flag device allocation as proxied if behind a PCI bridge irqchip/gic-v3-its: Tag ITS device as shared if allocating for a proxy device platform-msi: Track shared domain allocation irqchip/ti-sci-intr: Fix freeing of irqs irqchip/ti-sci-inta: Fix printing of inta id on probe success drivers/irqchip: Remove EZChip NPS interrupt controller Revert "genirq: Add fasteoi IPI flow" irqchip/hip04: Make IPIs use handle_percpu_devid_irq() irqchip/bcm2836: Make IPIs use handle_percpu_devid_irq() irqchip/armada-370-xp: Make IPIs use handle_percpu_devid_irq() irqchip/gic, gic-v3: Make SGIs use handle_percpu_devid_irq() irqchip/ocelot: Add support for Jaguar2 platforms irqchip/ocelot: Add support for Serval platforms irqchip/ocelot: Add support for Luton platforms irqchip/ocelot: prepare to support more SoC ...
2020-12-15	Merge branch 'akpm' (patches from Andrew)	Linus Torvalds	1	-10/+1
	Merge misc updates from Andrew Morton: - a few random little subsystems - almost all of the MM patches which are staged ahead of linux-next material. I'll trickle to post-linux-next work in as the dependents get merged up. Subsystems affected by this patch series: kthread, kbuild, ide, ntfs, ocfs2, arch, and mm (slab-generic, slab, slub, dax, debug, pagecache, gup, swap, shmem, memcg, pagemap, mremap, hmm, vmalloc, documentation, kasan, pagealloc, memory-failure, hugetlb, vmscan, z3fold, compaction, oom-kill, migration, cma, page-poison, userfaultfd, zswap, zsmalloc, uaccess, zram, and cleanups). * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (200 commits) mm: cleanup kstrto() usage mm: fix fall-through warnings for Clang mm: slub: convert sysfs sprintf family to sysfs_emit/sysfs_emit_at mm: shmem: convert shmem_enabled_show to use sysfs_emit_at mm:backing-dev: use sysfs_emit in macro defining functions mm: huge_memory: convert remaining use of sprintf to sysfs_emit and neatening mm: use sysfs_emit for struct kobject uses mm: fix kernel-doc markups zram: break the strict dependency from lzo zram: add stat to gather incompressible pages since zram set up zram: support page writeback mm/process_vm_access: remove redundant initialization of iov_r mm/zsmalloc.c: rework the list_add code in insert_zspage() mm/zswap: move to use crypto_acomp API for hardware acceleration mm/zswap: fix passing zero to 'PTR_ERR' warning mm/zswap: make struct kernel_param_ops definitions const userfaultfd/selftests: hint the test runner on required privilege userfaultfd/selftests: fix retval check for userfaultfd_open() userfaultfd/selftests: always dump something in modes userfaultfd: selftests: make __{s,u}64 format specifiers portable ...
2020-12-15	mm: forbid splitting special mappings	Dmitry Safonov	1	-10/+1
	Don't allow splitting of vm_special_mapping's. It affects vdso/vvar areas. Uprobes have only one page in xol_area so they aren't affected. Those restrictions were enforced by checks in .mremap() callbacks. Restrict resizing with generic .split() callback. Link: https://lkml.kernel.org/r/20201013013416.390574-7-dima@arista.com Signed-off-by: Dmitry Safonov <dima@arista.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Andy Lutomirski <luto@kernel.org> Cc: Brian Geffon <bgeffon@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: John Hubbard <jhubbard@nvidia.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Ralph Campbell <rcampbell@nvidia.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-15	s390/irq: Use irq_desc_kstat_cpu() in show_msi_interrupt()	Thomas Gleixner	1	-1/+1
	The irq descriptor is already there, no need to look it up again. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Heiko Carstens <hca@linux.ibm.com> Link: https://lore.kernel.org/r/20201210194043.769108348@linutronix.de
2020-12-15	Merge tag 'irqchip-5.11' of ↵	Thomas Gleixner	7	-22/+25
	git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core Pull irqchip updates for 5.11 from Marc Zyngier: - Preliminary support for managed interrupts on platform devices - Correctly identify allocation of MSIs proxyied by another device - Remove the fasteoi IPI flow which has been proved useless - Generalise the Ocelot support to new SoCs - Improve GICv4.1 vcpu entry, matching the corresponding KVM optimisation - Work around spurious interrupts on Qualcomm PDC - Random fixes and cleanups Link: https://lore.kernel.org/r/20201212135626.1479884-1-maz@kernel.org
2020-12-14	Merge tag 's390-5.11-1' of ↵	Linus Torvalds	21	-344/+184
	git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Heiko Carstens: - Add support for the hugetlb_cma command line option to allocate gigantic hugepages using CMA - Add arch_get_random_long() support. - Add ap bus userspace notifications. - Increase default size of vmalloc area to 512GB and otherwise let it increase dynamically by the size of physical memory. This should fix all occurrences where the vmalloc area was not large enough. - Completely get rid of set_fs() (aka select SET_FS) and rework address space handling while doing that; making address space handling much more simple. - Reimplement getcpu vdso syscall in C. - Add support for extended SCLP responses (> 4k). This allows e.g. to handle also potential large system configurations. - Simplify KASAN by removing 3-level page table support and only supporting 4-levels from now on. - Improve debug-ability of the kernel decompressor code, which now prints also stack traces and symbols in case of problems to the console. - Remove more power management leftovers. - Other various fixes and improvements all over the place. * tag 's390-5.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (62 commits) s390/mm: add support to allocate gigantic hugepages using CMA s390/crypto: add arch_get_random_long() support s390/smp: perform initial CPU reset also for SMT siblings s390/mm: use invalid asce for user space when switching to init_mm s390/idle: fix accounting with machine checks s390/idle: add missing mt_cycles calculation s390/boot: add build-id to decompressor s390/kexec_file: fix diag308 subcode when loading crash kernel s390/cio: fix use-after-free in ccw_device_destroy_console s390/cio: remove pm support from ccw bus driver s390/cio: remove pm support from css-bus driver s390/cio: remove pm support from IO subchannel drivers s390/cio: remove pm support from chsc subchannel driver s390/vmur: remove unused pm related functions s390/tape: remove unsupported PM functions s390/cio: remove pm support from eadm-sch drivers s390: remove pm support from console drivers s390/dasd: remove unused pm related functions s390/zfcp: remove pm support from zfcp driver s390/ap: let bus_register() add the AP bus sysfs attributes ...
2020-12-10	s390/mm: add support to allocate gigantic hugepages using CMA	Gerald Schaefer	1	-0/+3
	Commit cf11e85fc08c ("mm: hugetlb: optionally allocate gigantic hugepages using cma") added support for allocating gigantic hugepages using CMA, by specifying the hugetlb_cma= kernel parameter, which will disable any boot-time allocation of gigantic hugepages. This patch enables that option also for s390. Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-12-09	s390/smp: perform initial CPU reset also for SMT siblings	Sven Schnelle	1	-15/+3
	Not resetting the SMT siblings might leave them in unpredictable state. One of the observed problems was that the CPU timer wasn't reset and therefore large system time values where accounted during CPU bringup. Cc: <stable@kernel.org> # 4.0 Fixes: 10ad34bc76dfb ("s390: add SMT support") Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-12-09	s390/idle: fix accounting with machine checks	Sven Schnelle	1	-6/+6
	When a machine check interrupt is triggered during idle, the code is using the async timer/clock for idle time calculation. It should use the machine check enter timer/clock which is passed to the macro. Fixes: 0b0ed657fe00 ("s390: remove critical section cleanup from entry.S") Cc: <stable@vger.kernel.org> # 5.8 Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-12-09	s390/idle: add missing mt_cycles calculation	Sven Schnelle	1	-9/+25
	During removal of the critical section cleanup the calculation of mt_cycles during idle was removed. This causes invalid accounting on systems with SMT enabled. Fixes: 0b0ed657fe00 ("s390: remove critical section cleanup from entry.S") Cc: <stable@vger.kernel.org> # 5.8 Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-12-02	sched/vtime: Consolidate IRQ time accounting	Frederic Weisbecker	1	-13/+32
	The 3 architectures implementing CONFIG_VIRT_CPU_ACCOUNTING_NATIVE all have their own version of irq time accounting that dispatch the cputime to the appropriate index: hardirq, softirq, system, idle, guest... from an all-in-one function. Instead of having these ad-hoc versions, move the cputime destination dispatch decision to the core code and leave only the actual per-index cputime accounting to the architecture. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20201202115732.27827-4-frederic@kernel.org
2020-12-02	s390/vtime: Use the generic IRQ entry accounting	Frederic Weisbecker	1	-4/+0
	s390 has its own version of IRQ entry accounting because it doesn't account the idle time the same way the other architectures do. Only the actual idle sleep time is accounted as idle time, the rest of the idle task execution is accounted as system time. Make the generic IRQ entry accounting aware of architectures that have their own way of accounting idle time and convert s390 to use it. This prepares s390 to get involved in further consolidations of IRQ time accounting. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20201202115732.27827-3-frederic@kernel.org
2020-12-02	sched/cputime: Remove symbol exports from IRQ time accounting	Frederic Weisbecker	1	-5/+5
	account_irq_enter_time() and account_irq_exit_time() are not called from modules. EXPORT_SYMBOL_GPL() can be safely removed from the IRQ cputime accounting functions called from there. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20201202115732.27827-2-frederic@kernel.org
2020-12-02	s390: fix irq state tracing	Heiko Carstens	1	-15/+0
	With commit 58c644ba512c ("sched/idle: Fix arch_cpu_idle() vs tracing") common code calls arch_cpu_idle() with a lockdep state that tells irqs are on. This doesn't work very well for s390: psw_idle() will enable interrupts to wait for an interrupt. As soon as an interrupt occurs the interrupt handler will verify if the old context was psw_idle(). If that is the case the interrupt enablement bits in the old program status word will be cleared. A subsequent test in both the external as well as the io interrupt handler checks if in the old context interrupts were enabled. Due to the above patching of the old program status word it is assumed the old context had interrupts disabled, and therefore a call to TRACE_IRQS_OFF (aka trace_hardirqs_off_caller) is skipped. Which in turn makes lockdep incorrectly "think" that interrupts are enabled within the interrupt handler. Fix this by unconditionally calling TRACE_IRQS_OFF when entering interrupt handlers. Also call unconditionally TRACE_IRQS_ON when leaving interrupts handlers. This leaves the special psw_idle() case, which now returns with interrupts disabled, but has an "irqs on" lockdep state. So callers of psw_idle() must adjust the state on their own, if required. This is currently only __udelay_disabled(). Fixes: 58c644ba512c ("sched/idle: Fix arch_cpu_idle() vs tracing") Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-30	s390/vdso: add missing prototypes for vdso functions	Heiko Carstens	3	-0/+16
	clang W=1 warns about missing prototypes: >> arch/s390/kernel/vdso64/getcpu.c:8:5: warning: no previous prototype for function '__s390_vdso_getcpu' [-Wmissing-prototypes] int __s390_vdso_getcpu(unsigned cpu, unsigned node, struct getcpu_cache *unused) ^ Add a local header file in order to get rid of this warnings. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-29	Merge tag 'locking-urgent-2020-11-29' of ↵	Linus Torvalds	1	-3/+3
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Thomas Gleixner: "Two more places which invoke tracing from RCU disabled regions in the idle path. Similar to the entry path the low level idle functions have to be non-instrumentable" * tag 'locking-urgent-2020-11-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: intel_idle: Fix intel_idle() vs tracing sched/idle: Fix arch_cpu_idle() vs tracing
2020-11-27	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm	Linus Torvalds	1	-1/+8
	Pull kvm fixes from Paolo Bonzini: "ARM: - Fix alignment of the new HYP sections - Fix GICR_TYPER access from userspace S390: - do not reset the global diag318 data for per-cpu reset - do not mark memory as protected too early - fix for destroy page ultravisor call x86: - fix for SEV debugging - fix incorrect return code - fix for 'noapic' with PIC in userspace and LAPIC in kernel - fix for 5-level paging" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: kvm: x86/mmu: Fix get_mmio_spte() on CPUs supporting 5-level PT KVM: x86: Fix split-irqchip vs interrupt injection window request KVM: x86: handle !lapic_in_kernel case in kvm_cpu_*_extint MAINTAINERS: Update email address for Sean Christopherson MAINTAINERS: add uv.c also to KVM/s390 s390/uv: handle destroy page legacy interface KVM: arm64: vgic-v3: Drop the reporting of GICR_TYPER.Last for userspace KVM: SVM: fix error return code in svm_create_vcpu() KVM: SVM: Fix offset computation bug in __sev_dbg_decrypt(). KVM: arm64: Correctly align nVHE percpu data KVM: s390: remove diag318 reset code KVM: s390: pv: Mark mm as protected after the set secure parameters and improve cleanup
2020-11-24	Merge tag 's390-5.10-5' of ↵	Linus Torvalds	2	-5/+7
	git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fix from Heiko Carstens: "Disable interrupts when restoring fpu and vector registers, otherwise KVM guests might see corrupted register contents" * tag 's390-5.10-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390: fix fpu restore in entry.S
2020-11-24	sched/idle: Fix arch_cpu_idle() vs tracing	Peter Zijlstra	1	-3/+3
	We call arch_cpu_idle() with RCU disabled, but then use local_irq_{en,dis}able(), which invokes tracing, which relies on RCU. Switch all arch_cpu_idle() implementations to use raw_local_irq_{en,dis}able() and carefully manage the lockdep,rcu,tracing state like we do in entry. (XXX: we really should change arch_cpu_idle() to not return with interrupts enabled) Reported-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Link: https://lkml.kernel.org/r/20201120114925.594122626@infradead.org
2020-11-23	s390/vdso: reimplement getcpu vdso syscall	Heiko Carstens	6	-2/+33
	Implement the previously removed getcpu vdso syscall by using the TOD programmable field to pass the cpu number to user space. Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-23	s390/mm: add debug user asce support	Heiko Carstens	1	-0/+8
	Verify on exit to user space that always - the primary ASCE (cr1) is set to kernel ASCE - the secondary ASCE (cr7) is set to user ASCE If this is not the case: panic since something went terribly wrong. Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-23	s390/mm: use invalid asce instead of kernel asce	Heiko Carstens	2	-2/+3
	Create a region 3 page table which contains only invalid entries, and use that via "s390_invalid_asce" instead of the kernel ASCE whenever there is either - no user address space available, e.g. during early startup - as an intermediate ASCE when address spaces are switched This makes sure that user space accesses in such situations are guaranteed to fail. Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-23	s390/mm: remove set_fs / rework address space handling	Heiko Carstens	9	-170/+28
	Remove set_fs support from s390. With doing this rework address space handling and simplify it. As a result address spaces are now setup like this: CPU running in \| %cr1 ASCE \| %cr7 ASCE \| %cr13 ASCE ----------------------------\|-----------\|-----------\|----------- user space \| user \| user \| kernel kernel, normal execution \| kernel \| user \| kernel kernel, kvm guest execution \| gmap \| user \| kernel To achieve this the getcpu vdso syscall is removed in order to avoid secondary address mode and a separate vdso address space in for user space. The getcpu vdso syscall will be implemented differently with a subsequent patch. The kernel accesses user space always via secondary address space. This happens in different ways: - with mvcos in home space mode and directly read/write to secondary address space - with mvcs/mvcp in primary space mode and copy from primary space to secondary space or vice versa - with e.g. cs in secondary space mode and access secondary space Switching translation modes happens with sacf before and after instructions which access user space, like before. Lazy handling of control register reloading is removed in the hope to make everything simpler, but at the cost of making kernel entry and exit a bit slower. That is: on kernel entry the primary asce is always changed to contain the kernel asce, and on kernel exit the primary asce is changed again so it contains the user asce. In kernel mode there is only one exception to the primary asce: when kvm guests are executed the primary asce contains the gmap asce (which describes the guest address space). The primary asce is reset to kernel asce whenever kvm guest execution is interrupted, so that this doesn't has to be taken into account for any user space accesses. Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-23	Merge branch 'fixes' into features	Heiko Carstens	2	-5/+7
	* fixes: s390: fix fpu restore in entry.S Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-23	s390: fix fpu restore in entry.S	Sven Schnelle	2	-5/+7
	We need to disable interrupts in load_fpu_regs(). Otherwise an interrupt might come in after the registers are loaded, but before CIF_FPU is cleared in load_fpu_regs(). When the interrupt returns, CIF_FPU will be cleared and the registers will never be restored. The entry.S code usually saves the interrupt state in __SF_EMPTY on the stack when disabling/restoring interrupts. sie64a however saves the pointer to the sie control block in __SF_SIE_CONTROL, which references the same location. This is non-obvious to the reader. To avoid thrashing the sie control block pointer in load_fpu_regs(), move the __SIE_* offsets eight bytes after __SF_EMPTY on the stack. Cc: <stable@vger.kernel.org> # 5.8 Fixes: 0b0ed657fe00 ("s390: remove critical section cleanup from entry.S") Reported-by: Pierre Morel <pmorel@linux.ibm.com> Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-20	s390/stp: let subsys_system_register() sysfs attributes	Julian Wiedmann	1	-30/+14
	Instead of creating the sysfs attributes for the stp root_dev by hand, pass them to subsys_system_register() as parameter. This also ensures that the attributes are available when the KOBJ_ADD event is raised. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-20	s390/ftrace: assume -mhotpatch or -mrecord-mcount always available	Vasily Gorbik	2	-57/+14
	Currently the kernel minimal compiler requirement is gcc 4.9 or clang 10.0.1. * gcc -mhotpatch option is supported since 4.8. * A combination of -pg -mrecord-mcount -mnop-mcount -mfentry flags is supported since gcc 9 and since clang 10. Drop support for old -pg function prologues. Which leaves binary compatible -mhotpatch / -mnop-mcount -mfentry prologues in a form: brcl 0,0 Which are also do not require initial nop optimization / conversion and presence of _mcount symbol. Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-20	s390: unify identity mapping limits handling	Vasily Gorbik	1	-20/+17
	Currently we have to consider too many different values which in the end only affect identity mapping size. These are: 1. max_physmem_end - end of physical memory online or standby. Always <= end of the last online memory block (get_mem_detect_end()). 2. CONFIG_MAX_PHYSMEM_BITS - the maximum size of physical memory the kernel is able to support. 3. "mem=" kernel command line option which limits physical memory usage. 4. OLDMEM_BASE which is a kdump memory limit when the kernel is executed as crash kernel. 5. "hsa" size which is a memory limit when the kernel is executed during zfcp/nvme dump. Through out kernel startup and run we juggle all those values at once but that does not bring any amusement, only confusion and complexity. Unify all those values to a single one we should really care, that is our identity mapping size. Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-20	s390: add separate program check exit path	Heiko Carstens	1	-3/+11
	System call and program check handler both use the system call exit path when returning to previous context. However the program check handler jumps right to the end of the system call exit path if the previous context is kernel context. This lead to the quite odd double disabling of interrupts in the system call exit path introduced with commit ce9dfafe29be ("s390: fix system call exit path"). To avoid that have a separate program check handler exit path if the previous context is kernel context. Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-20	Merge branch 'fixes' into features	Heiko Carstens	2	-1/+3
	* fixes: s390/cpum_sf.c: fix file permission for cpum_sfb_size s390: update defconfigs s390: fix system call exit path Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-18	Merge tag 'kvm-s390-master-5.10-2' of ↵	Paolo Bonzini	1	-1/+8
	git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvm-master KVM: s390: Fix for destroy page ultravisor call - handle response code from older firmware - add uv.c to KVM: s390/s390 maintainer list
2020-11-18	s390/uv: handle destroy page legacy interface	Christian Borntraeger	1	-1/+8
	Older firmware can return rc=0x107 rrc=0xd for destroy page if the page is already non-secure. This should be handled like a success as already done by newer firmware. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Fixes: 1a80b54d1ce1 ("s390/uv: add destroy page call") Reviewed-by: David Hildenbrand <david@redhat.com> Acked-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
2020-11-17	Merge tag 's390-5.10-4' of ↵	Linus Torvalds	2	-1/+3
	git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Heiko Carstens: - fix system call exit path; avoid return to user space with any TIF/CIF/PIF set - fix file permission for cpum_sfb_size parameter - another small defconfig update * tag 's390-5.10-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/cpum_sf.c: fix file permission for cpum_sfb_size s390: update defconfigs s390: fix system call exit path
2020-11-13	ftrace: Have the callbacks receive a struct ftrace_regs instead of pt_regs	Steven Rostedt (VMware)	1	-1/+3
	In preparation to have arguments of a function passed to callbacks attached to functions as default, change the default callback prototype to receive a struct ftrace_regs as the forth parameter instead of a pt_regs. For callbacks that set the FL_SAVE_REGS flag in their ftrace_ops flags, they will now need to get the pt_regs via a ftrace_get_regs() helper call. If this is called by a callback that their ftrace_ops did not have a FL_SAVE_REGS flag set, it that helper function will return NULL. This will allow the ftrace_regs to hold enough just to get the parameters and stack pointer, but without the worry that callbacks may have a pt_regs that is not completely filled. Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-11-12	s390/cpum_sf.c: fix file permission for cpum_sfb_size	Thomas Richter	1	-1/+1
	This file is installed by the s390 CPU Measurement sampling facility device driver to export supported minimum and maximum sample buffer sizes. This file is read by lscpumf tool to display the details of the device driver capabilities. The lscpumf tool might be invoked by a non-root user. In this case it does not print anything because the file contents can not be read. Fix this by allowing read access for all users. Reading the file contents is ok, changing the file contents is left to the root user only. For further reference and details see: [1] https://github.com/ibm-s390-tools/s390-tools/issues/97 Fixes: 69f239ed335a ("s390/cpum_sf: Dynamically extend the sampling buffer if overflows occur") Cc: <stable@vger.kernel.org> # 3.14 Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-09	perf/arch: Remove perf_sample_data::regs_user_copy	Peter Zijlstra	1	-2/+1
	struct perf_sample_data lives on-stack, we should be careful about it's size. Furthermore, the pt_regs copy in there is only because x86_64 is a trainwreck, solve it differently. Reported-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Steven Rostedt <rostedt@goodmis.org> Link: https://lkml.kernel.org/r/20201030151955.258178461@infradead.org