summaryrefslogtreecommitdiffstats
path: root/arch/mips/kernel
AgeCommit message (Collapse)AuthorFilesLines
2021-07-09Merge tag 'irqchip-fixes-5.14-1' of ↵Thomas Gleixner1-0/+16
git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/urgent Pull irqchip fixes from Marc Zyngier: - Fix a MIPS bug where irqdomain loopkups could occur in a context where RCU is not allowed - Fix a documentation bug for handle_domain_irq
2021-07-09irqchip/mips: Fix RCU violation when using irqdomain lookup on interrupt entryMarc Zyngier1-0/+16
Since d4a45c68dc81 ("irqdomain: Protect the linear revmap with RCU"), any irqdomain lookup requires the RCU read lock to be held. This assumes that the architecture code will be structured such as irq_enter() will be called *before* the interrupt is looked up in the irq domain. However, this isn't the case for MIPS, and a number of drivers are structured to do it the other way around when handling an interrupt in their root irqchip (secondary irqchips are OK by construction). This results in a RCU splat on a lockdep-enabled kernel when the kernel takes an interrupt from idle, as reported by Guenter Roeck. Note that this could have fired previously if any driver had used tree-based irqdomain, which always had the RCU requirement. To solve this, provide a MIPS-specific helper (do_domain_IRQ()) as the pendent of do_IRQ() that will do thing in the right order (and maybe save some cycles in the process). Ideally, MIPS would be moved over to using handle_domain_irq(), but that's much more ambitious. Reported-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Guenter Roeck <linux@roeck-us.net> [maz: add dependency on CONFIG_IRQ_DOMAIN after report from the kernelci bot] Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Serge Semin <fancer.lancer@gmail.com> Link: https://lore.kernel.org/r/20210705172352.GA56304@roeck-us.net Link: https://lore.kernel.org/r/20210706110647.3979002-1-maz@kernel.org
2021-07-02Merge branch 'akpm' (patches from Andrew)Linus Torvalds1-0/+1
Merge more updates from Andrew Morton: "190 patches. Subsystems affected by this patch series: mm (hugetlb, userfaultfd, vmscan, kconfig, proc, z3fold, zbud, ras, mempolicy, memblock, migration, thp, nommu, kconfig, madvise, memory-hotplug, zswap, zsmalloc, zram, cleanups, kfence, and hmm), procfs, sysctl, misc, core-kernel, lib, lz4, checkpatch, init, kprobes, nilfs2, hfs, signals, exec, kcov, selftests, compress/decompress, and ipc" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (190 commits) ipc/util.c: use binary search for max_idx ipc/sem.c: use READ_ONCE()/WRITE_ONCE() for use_global_lock ipc: use kmalloc for msg_queue and shmid_kernel ipc sem: use kvmalloc for sem_undo allocation lib/decompressors: remove set but not used variabled 'level' selftests/vm/pkeys: exercise x86 XSAVE init state selftests/vm/pkeys: refill shadow register after implicit kernel write selftests/vm/pkeys: handle negative sys_pkey_alloc() return code selftests/vm/pkeys: fix alloc_random_pkey() to make it really, really random kcov: add __no_sanitize_coverage to fix noinstr for all architectures exec: remove checks in __register_bimfmt() x86: signal: don't do sas_ss_reset() until we are certain that sigframe won't be abandoned hfsplus: report create_date to kstat.btime hfsplus: remove unnecessary oom message nilfs2: remove redundant continue statement in a while-loop kprobes: remove duplicated strong free_insn_page in x86 and s390 init: print out unknown kernel parameters checkpatch: do not complain about positive return values starting with EPOLL checkpatch: improve the indented label test checkpatch: scripts/spdxcheck.py now requires python3 ...
2021-07-01Merge tag 'mips_5.14' of ↵Linus Torvalds3-31/+10
git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux Pull MIPS updates from Thomas Bogendoerfer: - add support for OpeneEmbed SOM9331 board - Ingenic fixes/improvments - other fixes and cleanups * tag 'mips_5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: (39 commits) MIPS: Fix PKMAP with 32-bit MIPS huge page support MIPS: CI20: Add second percpu timer for SMP. MIPS: CI20: Reduce clocksource to 750 kHz. MIPS: Ingenic: Add MAC syscon nodes for Ingenic SoCs. dt-bindings: clock: Add documentation for MAC PHY control bindings. MIPS: X1830: Respect cell count of common properties. MIPS: set mips32r5 for virt extensions MIPS: loongsoon64: Reserve memory below starting pfn to prevent Oops MIPS: MT extensions are not available on MIPS32r1 mips/kvm: Use BUG_ON instead of if condition followed by BUG MIPS: OCTEON: octeon-usb: Use devm_platform_get_and_ioremap_resource() MIPS: add PMD table accounting into MIPS'pmd_alloc_one MIPS: Loongson64: fix spelling of SPDX tag MIPS: ingenic: rs90: Add dedicated VRAM memory region MIPS: ingenic: gcw0: Set codec to cap-less mode for FM radio MIPS: ingenic: jz4780: Fix I2C nodes to match DT doc MIPS: ingenic: Select CPU_SUPPORTS_CPUFREQ && MIPS_EXTERNAL_TIMER MIPS: Kconfig: ingenic: Ensure MACH_INGENIC_GENERIC selects all SoCs MIPS: cpu-probe: Fix FPU detection on Ingenic JZ4760(B) MIPS: boot: Support specifying UART port on Ingenic SoCs ...
2021-07-01Merge tag 'fs_for_v5.14-rc1' of ↵Linus Torvalds3-3/+3
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull misc fs updates from Jan Kara: "The new quotactl_fd() syscall (remake of quotactl_path() syscall that got introduced & disabled in 5.13 cycle), and couple of udf, reiserfs, isofs, and writeback fixes and cleanups" * tag 'fs_for_v5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: writeback: fix obtain a reference to a freeing memcg css quota: remove unnecessary oom message isofs: remove redundant continue statement quota: Wire up quotactl_fd syscall quota: Change quotactl_path() systcall to an fd-based one reiserfs: Remove unneed check in reiserfs_write_full_page() udf: Fix NULL pointer dereference in udf_symlink function reiserfs: add check for invalid 1st journal block
2021-07-01kernel.h: split out panic and oops helpersAndy Shevchenko1-0/+1
kernel.h is being used as a dump for all kinds of stuff for a long time. Here is the attempt to start cleaning it up by splitting out panic and oops helpers. There are several purposes of doing this: - dropping dependency in bug.h - dropping a loop by moving out panic_notifier.h - unload kernel.h from something which has its own domain At the same time convert users tree-wide to use new headers, although for the time being include new header back to kernel.h to avoid twisted indirected includes for existing users. [akpm@linux-foundation.org: thread_info.h needs limits.h] [andriy.shevchenko@linux.intel.com: ia64 fix] Link: https://lkml.kernel.org/r/20210520130557.55277-1-andriy.shevchenko@linux.intel.com Link: https://lkml.kernel.org/r/20210511074137.33666-1-andriy.shevchenko@linux.intel.com Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Co-developed-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: Corey Minyard <cminyard@mvista.com> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Wei Liu <wei.liu@kernel.org> Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Sebastian Reichel <sre@kernel.org> Acked-by: Luis Chamberlain <mcgrof@kernel.org> Acked-by: Stephen Boyd <sboyd@kernel.org> Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Acked-by: Helge Deller <deller@gmx.de> # parisc Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-29Merge branch 'akpm' (patches from Andrew)Linus Torvalds1-3/+1
Merge misc updates from Andrew Morton: "191 patches. Subsystems affected by this patch series: kthread, ia64, scripts, ntfs, squashfs, ocfs2, kernel/watchdog, and mm (gup, pagealloc, slab, slub, kmemleak, dax, debug, pagecache, gup, swap, memcg, pagemap, mprotect, bootmem, dma, tracing, vmalloc, kasan, initialization, pagealloc, and memory-failure)" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (191 commits) mm,hwpoison: make get_hwpoison_page() call get_any_page() mm,hwpoison: send SIGBUS with error virutal address mm/page_alloc: split pcp->high across all online CPUs for cpuless nodes mm/page_alloc: allow high-order pages to be stored on the per-cpu lists mm: replace CONFIG_FLAT_NODE_MEM_MAP with CONFIG_FLATMEM mm: replace CONFIG_NEED_MULTIPLE_NODES with CONFIG_NUMA docs: remove description of DISCONTIGMEM arch, mm: remove stale mentions of DISCONIGMEM mm: remove CONFIG_DISCONTIGMEM m68k: remove support for DISCONTIGMEM arc: remove support for DISCONTIGMEM arc: update comment about HIGHMEM implementation alpha: remove DISCONTIGMEM and NUMA mm/page_alloc: move free_the_page mm/page_alloc: fix counting of managed_pages mm/page_alloc: improve memmap_pages dbg msg mm: drop SECTION_SHIFT in code comments mm/page_alloc: introduce vm.percpu_pagelist_high_fraction mm/page_alloc: limit the number of pages on PCP lists when reclaim is active mm/page_alloc: scale the number of pages that are batch freed ...
2021-06-29arch/mips/kernel/traps: use vma_lookup() instead of find_vma()Liam Howlett1-3/+1
Use vma_lookup() to find the VMA at a specific address. As vma_lookup() will return NULL if the address is not within any VMA, the start address no longer needs to be validated. Link: https://lkml.kernel.org/r/20210521174745.2219620-8-Liam.Howlett@Oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Davidlohr Bueso <dbueso@suse.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-28Merge tag 'sched-core-2021-06-28' of ↵Linus Torvalds3-3/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler udpates from Ingo Molnar: - Changes to core scheduling facilities: - Add "Core Scheduling" via CONFIG_SCHED_CORE=y, which enables coordinated scheduling across SMT siblings. This is a much requested feature for cloud computing platforms, to allow the flexible utilization of SMT siblings, without exposing untrusted domains to information leaks & side channels, plus to ensure more deterministic computing performance on SMT systems used by heterogenous workloads. There are new prctls to set core scheduling groups, which allows more flexible management of workloads that can share siblings. - Fix task->state access anti-patterns that may result in missed wakeups and rename it to ->__state in the process to catch new abuses. - Load-balancing changes: - Tweak newidle_balance for fair-sched, to improve 'memcache'-like workloads. - "Age" (decay) average idle time, to better track & improve workloads such as 'tbench'. - Fix & improve energy-aware (EAS) balancing logic & metrics. - Fix & improve the uclamp metrics. - Fix task migration (taskset) corner case on !CONFIG_CPUSET. - Fix RT and deadline utilization tracking across policy changes - Introduce a "burstable" CFS controller via cgroups, which allows bursty CPU-bound workloads to borrow a bit against their future quota to improve overall latencies & batching. Can be tweaked via /sys/fs/cgroup/cpu/<X>/cpu.cfs_burst_us. - Rework assymetric topology/capacity detection & handling. - Scheduler statistics & tooling: - Disable delayacct by default, but add a sysctl to enable it at runtime if tooling needs it. Use static keys and other optimizations to make it more palatable. - Use sched_clock() in delayacct, instead of ktime_get_ns(). - Misc cleanups and fixes. * tag 'sched-core-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (72 commits) sched/doc: Update the CPU capacity asymmetry bits sched/topology: Rework CPU capacity asymmetry detection sched/core: Introduce SD_ASYM_CPUCAPACITY_FULL sched_domain flag psi: Fix race between psi_trigger_create/destroy sched/fair: Introduce the burstable CFS controller sched/uclamp: Fix uclamp_tg_restrict() sched/rt: Fix Deadline utilization tracking during policy change sched/rt: Fix RT utilization tracking during policy change sched: Change task_struct::state sched,arch: Remove unused TASK_STATE offsets sched,timer: Use __set_current_state() sched: Add get_current_state() sched,perf,kvm: Fix preemption condition sched: Introduce task_is_running() sched: Unbreak wakeups sched/fair: Age the average idle time sched/cpufreq: Consider reduced CPU capacity in energy calculation sched/fair: Take thermal pressure into account while estimating energy thermal/cpufreq_cooling: Update offline CPUs per-cpu thermal_pressure sched/fair: Return early from update_tg_cfs_load() if delta == 0 ...
2021-06-28Merge tag 'perf-core-2021-06-28' of ↵Linus Torvalds1-3/+0
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf events updates from Ingo Molnar: - Platform PMU driver updates: - x86 Intel uncore driver updates for Skylake (SNR) and Icelake (ICX) servers - Fix RDPMC support - Fix [extended-]PEBS-via-PT support - Fix Sapphire Rapids event constraints - Fix :ppp support on Sapphire Rapids - Fix fixed counter sanity check on Alder Lake & X86_FEATURE_HYBRID_CPU - Other heterogenous-PMU fixes - Kprobes: - Remove the unused and misguided kprobe::fault_handler callbacks. - Warn about kprobes taking a page fault. - Fix the 'nmissed' stat counter. - Misc cleanups and fixes. * tag 'perf-core-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf: Fix task context PMU for Hetero perf/x86/intel: Fix instructions:ppp support in Sapphire Rapids perf/x86/intel: Add more events requires FRONTEND MSR on Sapphire Rapids perf/x86/intel: Fix fixed counter check warning for some Alder Lake perf/x86/intel: Fix PEBS-via-PT reload base value for Extended PEBS perf/x86: Reset the dirty counter to prevent the leak for an RDPMC task kprobes: Do not increment probe miss count in the fault handler x86,kprobes: WARN if kprobes tries to handle a fault kprobes: Remove kprobe::fault_handler uprobes: Update uprobe_write_opcode() kernel-doc comment perf/hw_breakpoint: Fix DocBook warnings in perf hw_breakpoint perf/core: Fix DocBook warnings perf/core: Make local function perf_pmu_snapshot_aux() static perf/x86/intel/uncore: Enable I/O stacks to IIO PMON mapping on ICX perf/x86/intel/uncore: Enable I/O stacks to IIO PMON mapping on SNR perf/x86/intel/uncore: Generalize I/O stacks to PMON mapping procedure perf/x86/intel/uncore: Drop unnecessary NULL checks after container_of()
2021-06-28Merge tag 'locking-core-2021-06-28' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: - Core locking & atomics: - Convert all architectures to ARCH_ATOMIC: move every architecture to ARCH_ATOMIC, then get rid of ARCH_ATOMIC and all the transitory facilities and #ifdefs. Much reduction in complexity from that series: 63 files changed, 756 insertions(+), 4094 deletions(-) - Self-test enhancements - Futexes: - Add the new FUTEX_LOCK_PI2 ABI, which is a variant that doesn't set FLAGS_CLOCKRT (.e. uses CLOCK_MONOTONIC). [ The temptation to repurpose FUTEX_LOCK_PI's implicit setting of FLAGS_CLOCKRT & invert the flag's meaning to avoid having to introduce a new variant was resisted successfully. ] - Enhance futex self-tests - Lockdep: - Fix dependency path printouts - Optimize trace saving - Broaden & fix wait-context checks - Misc cleanups and fixes. * tag 'locking-core-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (52 commits) locking/lockdep: Correct the description error for check_redundant() futex: Provide FUTEX_LOCK_PI2 to support clock selection futex: Prepare futex_lock_pi() for runtime clock selection lockdep/selftest: Remove wait-type RCU_CALLBACK tests lockdep/selftests: Fix selftests vs PROVE_RAW_LOCK_NESTING lockdep: Fix wait-type for empty stack locking/selftests: Add a selftest for check_irq_usage() lockding/lockdep: Avoid to find wrong lock dep path in check_irq_usage() locking/lockdep: Remove the unnecessary trace saving locking/lockdep: Fix the dep path printing for backwards BFS selftests: futex: Add futex compare requeue test selftests: futex: Add futex wait test seqlock: Remove trailing semicolon in macros locking/lockdep: Reduce LOCKDEP dependency list locking/lockdep,doc: Improve readability of the block matrix locking/atomics: atomic-instrumented: simplify ifdeffery locking/atomic: delete !ARCH_ATOMIC remnants locking/atomic: xtensa: move to ARCH_ATOMIC locking/atomic: sparc: move to ARCH_ATOMIC locking/atomic: sh: move to ARCH_ATOMIC ...
2021-06-18sched,arch: Remove unused TASK_STATE offsetsPeter Zijlstra1-1/+0
All 6 architectures define TASK_STATE in asm-offsets, but then never actually use it. Remove the definitions to make sure they never will. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20210611082838.472811363@infradead.org
2021-06-18sched: Introduce task_is_running()Peter Zijlstra1-1/+1
Replace a bunch of 'p->state == TASK_RUNNING' with a new helper: task_is_running(p). Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Davidlohr Bueso <dave@stgolabs.net> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210611082838.222401495@infradead.org
2021-06-07quota: Wire up quotactl_fd syscallJan Kara3-3/+3
Wire up the quotactl_fd syscall. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz>
2021-06-03Merge branch 'sched/urgent' into sched/core, to pick up fixesIngo Molnar3-3/+3
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2021-06-01kprobes: Remove kprobe::fault_handlerPeter Zijlstra1-3/+0
The reason for kprobe::fault_handler(), as given by their comment: * We come here because instructions in the pre/post * handler caused the page_fault, this could happen * if handler tries to access user space by * copy_from_user(), get_user() etc. Let the * user-specified handler try to fix it first. Is just plain bad. Those other handlers are ran from non-preemptible context and had better use _nofault() functions. Also, there is no upstream usage of this. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Link: https://lore.kernel.org/r/20210525073213.561116662@infradead.org
2021-06-01MIPS: cpu-probe: Fix FPU detection on Ingenic JZ4760(B)Paul Cercueil1-0/+5
Ingenic JZ4760 and JZ4760B do have a FPU, but the config registers don't report it. Force the FPU detection in case the processor ID match the JZ4760(B) one. Signed-off-by: Paul Cercueil <paul@crapouillou.net> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2021-06-01mips: syscalls: use pattern rules to generate syscall headersMasahiro Yamada1-24/+4
Use pattern rules to unify similar build rules among n32, n64, and o32. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2021-06-01mips: syscalls: define syscall offsets directly in <asm/unistd.h>Masahiro Yamada2-7/+1
There is no good reason to generate the syscall offset macros by scripting since they are not derived from the syscall tables. Define __NR_*_Linux macros directly in arch/mips/include/asm/unistd.h, and clean up the Makefile and the shell script. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2021-05-26locking/atomic: mips: move to ARCH_ATOMICMark Rutland1-2/+2
We'd like all architectures to convert to ARCH_ATOMIC, as once all architectures are converted it will be possible to make significant cleanups to the atomics headers, and this will make it much easier to generically enable atomic functionality (e.g. debug logic in the instrumented wrappers). As a step towards that, this patch migrates mips to ARCH_ATOMIC. The arch code provides arch_{atomic,atomic64,xchg,cmpxchg}*(), and common code wraps these with optional instrumentation to provide the regular functions. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Will Deacon <will@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20210525140232.53872-23-mark.rutland@arm.com
2021-05-17quota: Disable quotactl_path syscallJan Kara3-3/+3
In commit fa8b90070a80 ("quota: wire up quotactl_path") we have wired up new quotactl_path syscall. However some people in LWN discussion have objected that the path based syscall is missing dirfd and flags argument which is mostly standard for contemporary path based syscalls. Indeed they have a point and after a discussion with Christian Brauner and Sascha Hauer I've decided to disable the syscall for now and update its API. Since there is no userspace currently using that syscall and it hasn't been released in any major release, we should be fine. CC: Christian Brauner <christian.brauner@ubuntu.com> CC: Sascha Hauer <s.hauer@pengutronix.de> Link: https://lore.kernel.org/lkml/20210512153621.n5u43jsytbik4yze@wittgenstein Signed-off-by: Jan Kara <jack@suse.cz>
2021-05-12sched/core: Initialize the idle task with preemption disabledValentin Schneider1-1/+0
As pointed out by commit de9b8f5dcbd9 ("sched: Fix crash trying to dequeue/enqueue the idle thread") init_idle() can and will be invoked more than once on the same idle task. At boot time, it is invoked for the boot CPU thread by sched_init(). Then smp_init() creates the threads for all the secondary CPUs and invokes init_idle() on them. As the hotplug machinery brings the secondaries to life, it will issue calls to idle_thread_get(), which itself invokes init_idle() yet again. In this case it's invoked twice more per secondary: at _cpu_up(), and at bringup_cpu(). Given smp_init() already initializes the idle tasks for all *possible* CPUs, no further initialization should be required. Now, removing init_idle() from idle_thread_get() exposes some interesting expectations with regards to the idle task's preempt_count: the secondary startup always issues a preempt_disable(), requiring some reset of the preempt count to 0 between hot-unplug and hotplug, which is currently served by idle_thread_get() -> idle_init(). Given the idle task is supposed to have preemption disabled once and never see it re-enabled, it seems that what we actually want is to initialize its preempt_count to PREEMPT_DISABLED and leave it there. Do that, and remove init_idle() from idle_thread_get(). Secondary startups were patched via coccinelle: @begone@ @@ -preempt_disable(); ... cpu_startup_entry(CPUHP_AP_ONLINE_IDLE); Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210512094636.2958515-1-valentin.schneider@arm.com
2021-05-01Merge tag 'landlock_v34' of ↵Linus Torvalds3-0/+9
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull Landlock LSM from James Morris: "Add Landlock, a new LSM from Mickaël Salaün. Briefly, Landlock provides for unprivileged application sandboxing. From Mickaël's cover letter: "The goal of Landlock is to enable to restrict ambient rights (e.g. global filesystem access) for a set of processes. Because Landlock is a stackable LSM [1], it makes possible to create safe security sandboxes as new security layers in addition to the existing system-wide access-controls. This kind of sandbox is expected to help mitigate the security impact of bugs or unexpected/malicious behaviors in user-space applications. Landlock empowers any process, including unprivileged ones, to securely restrict themselves. Landlock is inspired by seccomp-bpf but instead of filtering syscalls and their raw arguments, a Landlock rule can restrict the use of kernel objects like file hierarchies, according to the kernel semantic. Landlock also takes inspiration from other OS sandbox mechanisms: XNU Sandbox, FreeBSD Capsicum or OpenBSD Pledge/Unveil. In this current form, Landlock misses some access-control features. This enables to minimize this patch series and ease review. This series still addresses multiple use cases, especially with the combined use of seccomp-bpf: applications with built-in sandboxing, init systems, security sandbox tools and security-oriented APIs [2]" The cover letter and v34 posting is here: https://lore.kernel.org/linux-security-module/20210422154123.13086-1-mic@digikod.net/ See also: https://landlock.io/ This code has had extensive design discussion and review over several years" Link: https://lore.kernel.org/lkml/50db058a-7dde-441b-a7f9-f6837fe8b69f@schaufler-ca.com/ [1] Link: https://lore.kernel.org/lkml/f646e1c7-33cf-333f-070c-0a40ad0468cd@digikod.net/ [2] * tag 'landlock_v34' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: landlock: Enable user space to infer supported features landlock: Add user and kernel documentation samples/landlock: Add a sandbox manager example selftests/landlock: Add user space tests landlock: Add syscall implementations arch: Wire up Landlock syscalls fs,security: Add sb_delete hook landlock: Support filesystem access-control LSM: Infrastructure management of the superblock landlock: Add ptrace restrictions landlock: Set up the security framework and manage credentials landlock: Add ruleset and domain management landlock: Add object management
2021-04-29Merge tag 'mips_5.13' of ↵Linus Torvalds20-346/+189
git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux Pull MIPS updates from Thomas Bogendoerfer: - removed get_fs/set_fs - removed broken/unmaintained MIPS KVM trap and emulate support - added support for Loongson-2K1000 - fixes and cleanups * tag 'mips_5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: (107 commits) MIPS: BCM63XX: Use BUG_ON instead of condition followed by BUG. MIPS: select ARCH_KEEP_MEMBLOCK unconditionally mips: Do not include hi and lo in clobber list for R6 MIPS:DTS:Correct the license for Loongson-2K MIPS:DTS:Fix label name and interrupt number of ohci for Loongson-2K MIPS: Avoid handcoded DIVU in `__div64_32' altogether lib/math/test_div64: Correct the spelling of "dividend" lib/math/test_div64: Fix error message formatting mips/bootinfo:correct some comments of fw_arg MIPS: Avoid DIVU in `__div64_32' is result would be zero MIPS: Reinstate platform `__div64_32' handler div64: Correct inline documentation for `do_div' lib/math: Add a `do_div' test module MIPS: Makefile: Replace -pg with CC_FLAGS_FTRACE MIPS: pci-legacy: revert "use generic pci_enable_resources" MIPS: Loongson64: Add kexec/kdump support MIPS: pci-legacy: use generic pci_enable_resources MIPS: pci-legacy: remove busn_resource field MIPS: pci-legacy: remove redundant info messages MIPS: pci-legacy: stop using of_pci_range_to_resource ...
2021-04-29Merge tag 'for_v5.13-rc1' of ↵Linus Torvalds3-0/+3
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull quota, ext2, reiserfs updates from Jan Kara: - support for path (instead of device) based quotactl syscall (quotactl_path(2)) - ext2 conversion to kmap_local() - other minor cleanups & fixes * tag 'for_v5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: fs/reiserfs/journal.c: delete useless variables fs/ext2: Replace kmap() with kmap_local_page() ext2: Match up ext2_put_page() with ext2_dotdot() and ext2_find_entry() fs/ext2/: fix misspellings using codespell tool quota: report warning limits for realtime space quotas quota: wire up quotactl_path quota: Add mountpath based quota support
2021-04-22arch: Wire up Landlock syscallsMickaël Salaün3-0/+9
Wire up the following system calls for all architectures: * landlock_create_ruleset(2) * landlock_add_rule(2) * landlock_restrict_self(2) Cc: Arnd Bergmann <arnd@arndb.de> Cc: James Morris <jmorris@namei.org> Cc: Jann Horn <jannh@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Serge E. Hallyn <serge@hallyn.com> Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com> Link: https://lore.kernel.org/r/20210422154123.13086-10-mic@digikod.net Signed-off-by: James Morris <jamorris@linux.microsoft.com>
2021-04-21MIPS: Makefile: Replace -pg with CC_FLAGS_FTRACEzhaoxiao1-4/+4
This patch replaces the "open-coded" -pg compile flag with a CC_FLAGS_FTRACE makefile variable which architectures can override if a different option should be used for code generation. Signed-off-by: zhaoxiao <zhaoxiao@uniontech.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2021-04-16MIPS: Loongson64: Add kexec/kdump supportHuacai Chen1-5/+4
Add kexec/kdump support for Loongson64 by: 1, Provide Loongson-specific kexec functions: loongson_kexec_prepare(), loongson_kexec_shutdown() and loongson_crash_shutdown(); 2, Provide Loongson-specific assembly code in kexec_smp_wait(); To start Loongson64, The boot CPU needs 3 parameters: fw_arg0: the number of arguments in cmdline (i.e., argc). fw_arg1: structure holds cmdline such as "root=/dev/sda1 console=tty" (i.e., argv). fw_arg2: environment (i.e., envp, additional boot parameters from LEFI). Non-boot CPUs do not need one parameter as the IPI mailbox base address. They query their own IPI mailbox to get PC, SP and GP in a loopi, until the boot CPU brings them up. loongson_kexec_prepare(): Setup cmdline for kexec/kdump. The kexec/kdump cmdline comes from kexec's "append" option string. This structure will be parsed in fw_init_cmdline() of arch/mips/fw/lib/cmdline.c. Both image ->control_code_page and the cmdline need to be in a safe memory region (memory allocated by the old kernel may be corrupted by the new kernel). In order to maintain compatibility for the old firmware, the low 2MB is reserverd and safe for Loongson. So let KEXEC_CTRL_CODE and KEXEC_ARGV_ ADDR be here. LEFI parameters may be corrupted at runtime, so backup it at mips_reboot_setup(), and then restore it at loongson_kexec_shutdown() /loongson_crash_shutdown(). loongson_kexec_shutdown(): Wake up all present CPUs and let them go to reboot_code_buffer. Pass the kexec parameters to kexec_args. loongson_crash_shutdown(): Pass the kdump parameters to kexec_args. The assembly part in kexec_smp_wait provide a routine as BIOS does, in order to keep secondary CPUs in a querying loop. The layout of low 2MB memory in our design: 0x80000000, the first MB, the first 64K, Exception vectors 0x80010000, the first MB, the second 64K, STR (suspend) data 0x80020000, the first MB, the third and fourth 64K, UEFI HOB 0x80040000, the first MB, the fifth 64K, RT-Thread for SMC 0x80100000, the second MB, the first 64K, KEXEC code 0x80108000, the second MB, the second 64K, KEXEC data Cc: Eric Biederman <ebiederm@xmission.com> Tested-by: Jinyang He <hejinyang@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@kernel.org> Signed-off-by: Jinyang He <hejinyang@loongson.cn> Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2021-04-07MIPS: Fix new sparse warningsThomas Bogendoerfer2-6/+7
Commit 45deb5faeb9e ("MIPS: uaccess: Remove get_fs/set_fs call sites") caused a few new sparse warnings, fix them. Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2021-04-06MIPS: Loongson64: Use _CACHE_UNCACHED instead of _CACHE_UNCACHED_ACCELERATEDTiezhu Yang1-3/+0
Loongson64 processors have a writecombine issue that maybe failed to write back framebuffer used with ATI Radeon or AMD GPU at times, after commit 8a08e50cee66 ("drm: Permit video-buffers writecombine mapping for MIPS"), there exists some errors such as blurred screen and lockup, and so on. [ 60.958721] radeon 0000:03:00.0: ring 0 stalled for more than 10079msec [ 60.965315] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000000112 last fence id 0x000000000000011d on ring 0) [ 60.976525] radeon 0000:03:00.0: ring 3 stalled for more than 10086msec [ 60.983156] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000000374 last fence id 0x00000000000003a8 on ring 3) As discussed earlier [1], it might be better to disable writecombine on the CPU detection side because the root cause is unknown now. Actually, this patch is a temporary solution to just make it work well, it is not a proper and final solution, I hope someone will have a better solution to fix this issue in the future. [1] https://lore.kernel.org/patchwork/patch/1285542/ Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2021-04-06MIPS: Remove get_fs/set_fsThomas Bogendoerfer3-6/+1
All get_fs/set_fs calls in MIPS code are gone, so remove implementation of it. With the clear separation of user/kernel space access we no longer need the EVA special handling, so get rid of that, too. Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Reviewed-by: Christoph Hellwig <hch@lst.de>
2021-04-06MIPS: uaccess: Remove get_fs/set_fs call sitesThomas Bogendoerfer3-186/+136
Use new helpers to access user/kernel for functions, which are used with user/kernel pointers. Instead of dealing with get_fs/set_fs select user/kernel access via parameter. Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Reviewed-by: Christoph Hellwig <hch@lst.de>
2021-04-06MIPS: kernel: Remove not needed set_fs callsThomas Bogendoerfer1-8/+0
flush_icache_range always does flush kernel address ranges, so no need to do the set_fs dance. Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Reviewed-by: Christoph Hellwig <hch@lst.de>
2021-04-06MIPS: Add support for CONFIG_DEBUG_VIRTUALFlorian Fainelli1-2/+3
Provide hooks to intercept bad usages of virt_to_phys() and __pa_symbol() throughout the kernel. To make this possible, we need to rename the current implement of virt_to_phys() into __virt_to_phys_nodebug() and wrap it around depending on CONFIG_DEBUG_VIRTUAL. A similar thing is needed for __pa_symbol() which is now aliased to __phys_addr_symbol() whose implementation is either the direct return of RELOC_HIDE or goes through the debug version. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2021-03-30MIPS: kernel: setup.c: fix compilation errorMauri Sandberg1-1/+1
With ath79_defconfig enabling CONFIG_MIPS_ELF_APPENDED_DTB gives a compilation error. This patch fixes it. Build log: ... CC kernel/locking/percpu-rwsem.o ../arch/mips/kernel/setup.c:46:39: error: conflicting types for '__appended_dtb' const char __section(".appended_dtb") __appended_dtb[0x100000]; ^~~~~~~~~~~~~~ In file included from ../arch/mips/kernel/setup.c:34: ../arch/mips/include/asm/bootinfo.h:118:13: note: previous declaration of '__appended_dtb' was here extern char __appended_dtb[]; ^~~~~~~~~~~~~~ CC fs/attr.o make[4]: *** [../scripts/Makefile.build:271: arch/mips/kernel/setup.o] Error 1 ... Root cause seems to be: Fixes: b83ba0b9df56 ("MIPS: of: Introduce helper function to get DTB") Signed-off-by: Mauri Sandberg <sandberg@mailfence.com> Reviewed-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: trivial@kernel.org Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2021-03-17quota: wire up quotactl_pathSascha Hauer3-0/+3
Wire up the quotactl_path syscall added in the previous patch. Link: https://lore.kernel.org/r/20210304123541.30749-3-s.hauer@pengutronix.de Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz>
2021-03-16MIPS: vmlinux.lds.S: Fix appended dtb not properly alignedPaul Cercueil1-1/+1
Commit 6654111c893f ("MIPS: vmlinux.lds.S: align raw appended dtb to 8 bytes") changed the alignment from STRUCT_ALIGNMENT bytes to 8 bytes. The commit's message makes it sound like it was actually done on purpose, but this is not the case. The commit was written when raw appended dtb were not aligned at all. The STRUCT_ALIGN() was added a few days before, in commit 7a05293af39f ("MIPS: boot/compressed: Copy DTB to aligned address"). The true purpose of the commit was not to align specifically to 8 bytes, but to make sure that the generated vmlinux' size was properly padded to the alignment required for DTBs. While the switch to 8-byte alignment worked for vmlinux-appended dtb blobs, it broke vmlinuz-appended dtb blobs, as the decompress routine moves the blob to a STRUCT_ALIGNMENT aligned address. Fix this by changing the raw appended dtb blob alignment from 8 bytes back to STRUCT_ALIGNMENT bytes in vmlinux.lds.S. Fixes: 6654111c893f ("MIPS: vmlinux.lds.S: align raw appended dtb to 8 bytes") Cc: Bjørn Mork <bjorn@mork.no> Signed-off-by: Paul Cercueil <paul@crapouillou.net> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2021-03-14mips: kernel: use DEFINE_DEBUGFS_ATTRIBUTE with debugfs_create_file_unsafe()Wang Qing1-4/+4
debugfs_create_file_unsafe does not protect the fops handed to it against file removal. DEFINE_DEBUGFS_ATTRIBUTE makes the fops aware of the file lifetime and thus protects it against removal. Signed-off-by: Wang Qing <wangqing@vivo.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2021-03-10mips: syscalls: switch to generic syscallhdr.shMasahiro Yamada2-44/+2
Many architectures duplicate similar shell scripts. This commit converts mips to use scripts/syscallhdr.sh. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2021-03-10mips: syscalls: switch to generic syscalltbl.shMasahiro Yamada6-67/+14
Many architectures duplicate similar shell scripts. This commit converts mips to use scripts/syscalltbl.sh. This also unifies syscall_table_32_o32.h and syscall_table_64_o32.h into syscall_table_o32.h. The offset parameters are unneeded here; __SYSCALL(nr, entry) is defined as 'PTR entry', so the parameter 'nr' is not used in the first place. With this commit, syscall tables and generated files are straight mapped, which makes things easier to understand. syscall_n32.tbl --> syscall_table_n32.h syscall_n64.tbl --> syscall_table_n64.h syscall_o32.tbl --> syscall_table_o32.h Then, the abi parameters are also unneeded. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2021-03-10MIPS: Remove KVM_GUEST supportThomas Bogendoerfer1-4/+0
KVM_GUEST is broken and unmaintained, so let's remove it. Reviewed-by: Huacai Chen <chenhuacai@kernel.org> Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2021-03-10Merge tag 'mips-fixes_5.12_1' into mips-nextThomas Bogendoerfer4-5/+20
- fixes for boot breakage because of misaligned FDTs - fix for overwritten exception handlers - enable MIPS optimized crypto for all MIPS CPUs to improve wireguard performance
2021-03-09MIPS: kernel: Reserve exception base early to prevent corruptionThomas Bogendoerfer3-5/+14
BMIPS is one of the few platforms that do change the exception base. After commit 2dcb39645441 ("memblock: do not start bottom-up allocations with kernel_end") we started seeing BMIPS boards fail to boot with the built-in FDT being corrupted. Before the cited commit, early allocations would be in the [kernel_end, RAM_END] range, but after commit they would be within [RAM_START + PAGE_SIZE, RAM_END]. The custom exception base handler that is installed by bmips_ebase_setup() done for BMIPS5000 CPUs ends-up trampling on the memory region allocated by unflatten_and_copy_device_tree() thus corrupting the FDT used by the kernel. To fix this, we need to perform an early reservation of the custom exception space. Additional we reserve the first 4k (1k for R3k) for either normal exception vector space (legacy CPUs) or special vectors like cache exceptions. Huge thanks to Serge for analysing and proposing a solution to this issue. Fixes: 2dcb39645441 ("memblock: do not start bottom-up allocations with kernel_end") Reported-by: Kamal Dasu <kdasu.kdev@gmail.com> Debugged-by: Serge Semin <Sergey.Semin@baikalelectronics.ru> Acked-by: Mike Rapoport <rppt@linux.ibm.com> Tested-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Serge Semin <fancer.lancer@gmail.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2021-03-08MIPS: vmlinux.lds.S: align raw appended dtb to 8 bytesBjørn Mork1-1/+5
The devicetree specification requires 8-byte alignment in memory. This is now enforced by libfdt since commit 79edff12060f ("scripts/dtc: Update to upstream version v1.6.0-51-g183df9e9c2b9") which included the upstream commit 5e735860c478 ("libfdt: Check for 8-byte address alignment in fdt_ro_probe_()"). This broke the MIPS raw appended DTBs which would be appended to the image immediately following the initramfs section. This ends with a 32bit size, resulting in a 4-byte alignment of the DTB. Fix by padding with zeroes to 8-bytes when MIPS_RAW_APPENDED_DTB is defined. Fixes: 79edff12060f ("scripts/dtc: Update to upstream version v1.6.0-51-g183df9e9c2b9") Cc: Rob Herring <robh+dt@kernel.org> Cc: Frank Rowand <frowand.list@gmail.com> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2021-03-08MIPS: boot/compressed: Copy DTB to aligned addressPaul Cercueil1-0/+2
Since 5.12-rc1, the Device Tree blob must now be properly aligned. Therefore, the decompress routine must be careful to copy the blob at the next aligned address after the kernel image. This commit fixes the kernel sometimes not booting with a Device Tree blob appended to it. Fixes: 79edff12060f ("scripts/dtc: Update to upstream version v1.6.0-51-g183df9e9c2b9") Signed-off-by: Paul Cercueil <paul@crapouillou.net> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2021-03-06mips: smp-bmips: fix CPU mappingsÁlvaro Fernández Rojas1-10/+17
When booting bmips with SMP enabled on a BCM6358 running on CPU #1 instead of CPU #0, the current CPU mapping code produces the following: - smp_processor_id(): 0 - cpu_logical_map(0): 1 - cpu_number_map(0): 1 This is because SMP isn't supported on BCM6358 since it has a shared TLB, so it is disabled and max_cpus is decreased from 2 to 1. Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2021-02-27Merge tag 'io_uring-worker.v3-2021-02-25' of git://git.kernel.dk/linux-blockLinus Torvalds1-1/+1
Pull io_uring thread rewrite from Jens Axboe: "This converts the io-wq workers to be forked off the tasks in question instead of being kernel threads that assume various bits of the original task identity. This kills > 400 lines of code from io_uring/io-wq, and it's the worst part of the code. We've had several bugs in this area, and the worry is always that we could be missing some pieces for file types doing unusual things (recent /dev/tty example comes to mind, userfaultfd reads installing file descriptors is another fun one... - both of which need special handling, and I bet it's not the last weird oddity we'll find). With these identical workers, we can have full confidence that we're never missing anything. That, in itself, is a huge win. Outside of that, it's also more efficient since we're not wasting space and code on tracking state, or switching between different states. I'm sure we're going to find little things to patch up after this series, but testing has been pretty thorough, from the usual regression suite to production. Any issue that may crop up should be manageable. There's also a nice series of further reductions we can do on top of this, but I wanted to get the meat of it out sooner rather than later. The general worry here isn't that it's fundamentally broken. Most of the little issues we've found over the last week have been related to just changes in how thread startup/exit is done, since that's the main difference between using kthreads and these kinds of threads. In fact, if all goes according to plan, I want to get this into the 5.10 and 5.11 stable branches as well. That said, the changes outside of io_uring/io-wq are: - arch setup, simple one-liner to each arch copy_thread() implementation. - Removal of net and proc restrictions for io_uring, they are no longer needed or useful" * tag 'io_uring-worker.v3-2021-02-25' of git://git.kernel.dk/linux-block: (30 commits) io-wq: remove now unused IO_WQ_BIT_ERROR io_uring: fix SQPOLL thread handling over exec io-wq: improve manager/worker handling over exec io_uring: ensure SQPOLL startup is triggered before error shutdown io-wq: make buffered file write hashed work map per-ctx io-wq: fix race around io_worker grabbing io-wq: fix races around manager/worker creation and task exit io_uring: ensure io-wq context is always destroyed for tasks arch: ensure parisc/powerpc handle PF_IO_WORKER in copy_thread() io_uring: cleanup ->user usage io-wq: remove nr_process accounting io_uring: flag new native workers with IORING_FEAT_NATIVE_WORKERS net: remove cmsg restriction from io_uring based send/recvmsg calls Revert "proc: don't allow async path resolution of /proc/self components" Revert "proc: don't allow async path resolution of /proc/thread-self components" io_uring: move SQPOLL thread io-wq forked worker io-wq: make io_wq_fork_thread() available to other users io-wq: only remove worker from free_list, if it was there io_uring: remove io_identity io_uring: remove any grabbing of context ...
2021-02-25Merge tag 'mips_5.12_1' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux Pull more MIPS updates from Thomas Bogendoerfer: - added n64 block driver - fix for ubsan warnings - fix for bcm63xx platform - update of linux-mips mailinglist * tag 'mips_5.12_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: arch: mips: update references to current linux-mips list mips: bmips: init clocks earlier vmlinux.lds.h: catch even more instrumentation symbols into .data n64: store dev instance into disk private data n64: cleanup n64cart_probe() n64: cosmetics changes n64: remove curly brackets n64: use sector SECTOR_SHIFT instead 512 n64: use enums for reg n64: move module param at the top n64: move module info at the end n64: use pr_fmt to avoid duplicate string block: Add n64 cart driver
2021-02-25Merge tag 'kbuild-v5.12' of ↵Linus Torvalds1-16/+17
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - Fix false-positive build warnings for ARCH=ia64 builds - Optimize dictionary size for module compression with xz - Check the compiler and linker versions in Kconfig - Fix misuse of extra-y - Support DWARF v5 debug info - Clamp SUBLEVEL to 255 because stable releases 4.4.x and 4.9.x exceeded the limit - Add generic syscall{tbl,hdr}.sh for cleanups across arches - Minor cleanups of genksyms - Minor cleanups of Kconfig * tag 'kbuild-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (38 commits) initramfs: Remove redundant dependency of RD_ZSTD on BLK_DEV_INITRD kbuild: remove deprecated 'always' and 'hostprogs-y/m' kbuild: parse C= and M= before changing the working directory kbuild: reuse this-makefile to define abs_srctree kconfig: unify rule of config, menuconfig, nconfig, gconfig, xconfig kconfig: omit --oldaskconfig option for 'make config' kconfig: fix 'invalid option' for help option kconfig: remove dead code in conf_askvalue() kconfig: clean up nested if-conditionals in check_conf() kconfig: Remove duplicate call to sym_get_string_value() Makefile: Remove # characters from compiler string Makefile: reuse CC_VERSION_TEXT kbuild: check the minimum linker version in Kconfig kbuild: remove ld-version macro scripts: add generic syscallhdr.sh scripts: add generic syscalltbl.sh arch: syscalls: remove $(srctree)/ prefix from syscall tables arch: syscalls: add missing FORCE and fix 'targets' to make if_changed work gen_compile_commands: prune some directories kbuild: simplify access to the kernel's version ...
2021-02-23Merge tag 'idmapped-mounts-v5.12' of ↵Linus Torvalds3-0/+3
git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull idmapped mounts from Christian Brauner: "This introduces idmapped mounts which has been in the making for some time. Simply put, different mounts can expose the same file or directory with different ownership. This initial implementation comes with ports for fat, ext4 and with Christoph's port for xfs with more filesystems being actively worked on by independent people and maintainers. Idmapping mounts handle a wide range of long standing use-cases. Here are just a few: - Idmapped mounts make it possible to easily share files between multiple users or multiple machines especially in complex scenarios. For example, idmapped mounts will be used in the implementation of portable home directories in systemd-homed.service(8) where they allow users to move their home directory to an external storage device and use it on multiple computers where they are assigned different uids and gids. This effectively makes it possible to assign random uids and gids at login time. - It is possible to share files from the host with unprivileged containers without having to change ownership permanently through chown(2). - It is possible to idmap a container's rootfs and without having to mangle every file. For example, Chromebooks use it to share the user's Download folder with their unprivileged containers in their Linux subsystem. - It is possible to share files between containers with non-overlapping idmappings. - Filesystem that lack a proper concept of ownership such as fat can use idmapped mounts to implement discretionary access (DAC) permission checking. - They allow users to efficiently changing ownership on a per-mount basis without having to (recursively) chown(2) all files. In contrast to chown (2) changing ownership of large sets of files is instantenous with idmapped mounts. This is especially useful when ownership of a whole root filesystem of a virtual machine or container is changed. With idmapped mounts a single syscall mount_setattr syscall will be sufficient to change the ownership of all files. - Idmapped mounts always take the current ownership into account as idmappings specify what a given uid or gid is supposed to be mapped to. This contrasts with the chown(2) syscall which cannot by itself take the current ownership of the files it changes into account. It simply changes the ownership to the specified uid and gid. This is especially problematic when recursively chown(2)ing a large set of files which is commong with the aforementioned portable home directory and container and vm scenario. - Idmapped mounts allow to change ownership locally, restricting it to specific mounts, and temporarily as the ownership changes only apply as long as the mount exists. Several userspace projects have either already put up patches and pull-requests for this feature or will do so should you decide to pull this: - systemd: In a wide variety of scenarios but especially right away in their implementation of portable home directories. https://systemd.io/HOME_DIRECTORY/ - container runtimes: containerd, runC, LXD:To share data between host and unprivileged containers, unprivileged and privileged containers, etc. The pull request for idmapped mounts support in containerd, the default Kubernetes runtime is already up for quite a while now: https://github.com/containerd/containerd/pull/4734 - The virtio-fs developers and several users have expressed interest in using this feature with virtual machines once virtio-fs is ported. - ChromeOS: Sharing host-directories with unprivileged containers. I've tightly synced with all those projects and all of those listed here have also expressed their need/desire for this feature on the mailing list. For more info on how people use this there's a bunch of talks about this too. Here's just two recent ones: https://www.cncf.io/wp-content/uploads/2020/12/Rootless-Containers-in-Gitpod.pdf https://fosdem.org/2021/schedule/event/containers_idmap/ This comes with an extensive xfstests suite covering both ext4 and xfs: https://git.kernel.org/brauner/xfstests-dev/h/idmapped_mounts It covers truncation, creation, opening, xattrs, vfscaps, setid execution, setgid inheritance and more both with idmapped and non-idmapped mounts. It already helped to discover an unrelated xfs setgid inheritance bug which has since been fixed in mainline. It will be sent for inclusion with the xfstests project should you decide to merge this. In order to support per-mount idmappings vfsmounts are marked with user namespaces. The idmapping of the user namespace will be used to map the ids of vfs objects when they are accessed through that mount. By default all vfsmounts are marked with the initial user namespace. The initial user namespace is used to indicate that a mount is not idmapped. All operations behave as before and this is verified in the testsuite. Based on prior discussions we want to attach the whole user namespace and not just a dedicated idmapping struct. This allows us to reuse all the helpers that already exist for dealing with idmappings instead of introducing a whole new range of helpers. In addition, if we decide in the future that we are confident enough to enable unprivileged users to setup idmapped mounts the permission checking can take into account whether the caller is privileged in the user namespace the mount is currently marked with. The user namespace the mount will be marked with can be specified by passing a file descriptor refering to the user namespace as an argument to the new mount_setattr() syscall together with the new MOUNT_ATTR_IDMAP flag. The system call follows the openat2() pattern of extensibility. The following conditions must be met in order to create an idmapped mount: - The caller must currently have the CAP_SYS_ADMIN capability in the user namespace the underlying filesystem has been mounted in. - The underlying filesystem must support idmapped mounts. - The mount must not already be idmapped. This also implies that the idmapping of a mount cannot be altered once it has been idmapped. - The mount must be a detached/anonymous mount, i.e. it must have been created by calling open_tree() with the OPEN_TREE_CLONE flag and it must not already have been visible in the filesystem. The last two points guarantee easier semantics for userspace and the kernel and make the implementation significantly simpler. By default vfsmounts are marked with the initial user namespace and no behavioral or performance changes are observed. The manpage with a detailed description can be found here: https://git.kernel.org/brauner/man-pages/c/1d7b902e2875a1ff342e036a9f866a995640aea8 In order to support idmapped mounts, filesystems need to be changed and mark themselves with the FS_ALLOW_IDMAP flag in fs_flags. The patches to convert individual filesystem are not very large or complicated overall as can be seen from the included fat, ext4, and xfs ports. Patches for other filesystems are actively worked on and will be sent out separately. The xfstestsuite can be used to verify that port has been done correctly. The mount_setattr() syscall is motivated independent of the idmapped mounts patches and it's been around since July 2019. One of the most valuable features of the new mount api is the ability to perform mounts based on file descriptors only. Together with the lookup restrictions available in the openat2() RESOLVE_* flag namespace which we added in v5.6 this is the first time we are close to hardened and race-free (e.g. symlinks) mounting and path resolution. While userspace has started porting to the new mount api to mount proper filesystems and create new bind-mounts it is currently not possible to change mount options of an already existing bind mount in the new mount api since the mount_setattr() syscall is missing. With the addition of the mount_setattr() syscall we remove this last restriction and userspace can now fully port to the new mount api, covering every use-case the old mount api could. We also add the crucial ability to recursively change mount options for a whole mount tree, both removing and adding mount options at the same time. This syscall has been requested multiple times by various people and projects. There is a simple tool available at https://github.com/brauner/mount-idmapped that allows to create idmapped mounts so people can play with this patch series. I'll add support for the regular mount binary should you decide to pull this in the following weeks: Here's an example to a simple idmapped mount of another user's home directory: u1001@f2-vm:/$ sudo ./mount --idmap both:1000:1001:1 /home/ubuntu/ /mnt u1001@f2-vm:/$ ls -al /home/ubuntu/ total 28 drwxr-xr-x 2 ubuntu ubuntu 4096 Oct 28 22:07 . drwxr-xr-x 4 root root 4096 Oct 28 04:00 .. -rw------- 1 ubuntu ubuntu 3154 Oct 28 22:12 .bash_history -rw-r--r-- 1 ubuntu ubuntu 220 Feb 25 2020 .bash_logout -rw-r--r-- 1 ubuntu ubuntu 3771 Feb 25 2020 .bashrc -rw-r--r-- 1 ubuntu ubuntu 807 Feb 25 2020 .profile -rw-r--r-- 1 ubuntu ubuntu 0 Oct 16 16:11 .sudo_as_admin_successful -rw------- 1 ubuntu ubuntu 1144 Oct 28 00:43 .viminfo u1001@f2-vm:/$ ls -al /mnt/ total 28 drwxr-xr-x 2 u1001 u1001 4096 Oct 28 22:07 . drwxr-xr-x 29 root root 4096 Oct 28 22:01 .. -rw------- 1 u1001 u1001 3154 Oct 28 22:12 .bash_history -rw-r--r-- 1 u1001 u1001 220 Feb 25 2020 .bash_logout -rw-r--r-- 1 u1001 u1001 3771 Feb 25 2020 .bashrc -rw-r--r-- 1 u1001 u1001 807 Feb 25 2020 .profile -rw-r--r-- 1 u1001 u1001 0 Oct 16 16:11 .sudo_as_admin_successful -rw------- 1 u1001 u1001 1144 Oct 28 00:43 .viminfo u1001@f2-vm:/$ touch /mnt/my-file u1001@f2-vm:/$ setfacl -m u:1001:rwx /mnt/my-file u1001@f2-vm:/$ sudo setcap -n 1001 cap_net_raw+ep /mnt/my-file u1001@f2-vm:/$ ls -al /mnt/my-file -rw-rwxr--+ 1 u1001 u1001 0 Oct 28 22:14 /mnt/my-file u1001@f2-vm:/$ ls -al /home/ubuntu/my-file -rw-rwxr--+ 1 ubuntu ubuntu 0 Oct 28 22:14 /home/ubuntu/my-file u1001@f2-vm:/$ getfacl /mnt/my-file getfacl: Removing leading '/' from absolute path names # file: mnt/my-file # owner: u1001 # group: u1001 user::rw- user:u1001:rwx group::rw- mask::rwx other::r-- u1001@f2-vm:/$ getfacl /home/ubuntu/my-file getfacl: Removing leading '/' from absolute path names # file: home/ubuntu/my-file # owner: ubuntu # group: ubuntu user::rw- user:ubuntu:rwx group::rw- mask::rwx other::r--" * tag 'idmapped-mounts-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: (41 commits) xfs: remove the possibly unused mp variable in xfs_file_compat_ioctl xfs: support idmapped mounts ext4: support idmapped mounts fat: handle idmapped mounts tests: add mount_setattr() selftests fs: introduce MOUNT_ATTR_IDMAP fs: add mount_setattr() fs: add attr_flags_to_mnt_flags helper fs: split out functions to hold writers namespace: only take read lock in do_reconfigure_mnt() mount: make {lock,unlock}_mount_hash() static namespace: take lock_mount_hash() directly when changing flags nfs: do not export idmapped mounts overlayfs: do not mount on top of idmapped mounts ecryptfs: do not mount on top of idmapped mounts ima: handle idmapped mounts apparmor: handle idmapped mounts fs: make helpers idmap mount aware exec: handle idmapped mounts would_dump: handle idmapped mounts ...