Age | Commit message (Collapse) | Author | Files | Lines |
|
git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Pull rdma DMA mapping updates from Doug Ledford:
"Drop IB DMA mapping code and use core DMA code instead.
Bart Van Assche noted that the ib DMA mapping code was significantly
similar enough to the core DMA mapping code that with a few changes it
was possible to remove the IB DMA mapping code entirely and switch the
RDMA stack to use the core DMA mapping code.
This resulted in a nice set of cleanups, but touched the entire tree
and has been kept separate for that reason."
* tag 'for-next-dma_ops' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (37 commits)
IB/rxe, IB/rdmavt: Use dma_virt_ops instead of duplicating it
IB/core: Remove ib_device.dma_device
nvme-rdma: Switch from dma_device to dev.parent
RDS: net: Switch from dma_device to dev.parent
IB/srpt: Modify a debug statement
IB/srp: Switch from dma_device to dev.parent
IB/iser: Switch from dma_device to dev.parent
IB/IPoIB: Switch from dma_device to dev.parent
IB/rxe: Switch from dma_device to dev.parent
IB/vmw_pvrdma: Switch from dma_device to dev.parent
IB/usnic: Switch from dma_device to dev.parent
IB/qib: Switch from dma_device to dev.parent
IB/qedr: Switch from dma_device to dev.parent
IB/ocrdma: Switch from dma_device to dev.parent
IB/nes: Remove a superfluous assignment statement
IB/mthca: Switch from dma_device to dev.parent
IB/mlx5: Switch from dma_device to dev.parent
IB/mlx4: Switch from dma_device to dev.parent
IB/i40iw: Remove a superfluous assignment statement
IB/hns: Switch from dma_device to dev.parent
...
|
|
The callers of the DMA alloc functions already provide the proper
context GFP flags. Make sure to pass them through to the CMA allocator,
to make the CMA compaction context aware.
Link: http://lkml.kernel.org/r/20170127172328.18574-3-l.stach@pengutronix.de
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Radim Krcmar <rkrcmar@redhat.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Alexander Graf <agraf@suse.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Will Deacon:
- Errata workarounds for Qualcomm's Falkor CPU
- Qualcomm L2 Cache PMU driver
- Qualcomm SMCCC firmware quirk
- Support for DEBUG_VIRTUAL
- CPU feature detection for userspace via MRS emulation
- Preliminary work for the Statistical Profiling Extension
- Misc cleanups and non-critical fixes
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (74 commits)
arm64/kprobes: consistently handle MRS/MSR with XZR
arm64: cpufeature: correctly handle MRS to XZR
arm64: traps: correctly handle MRS/MSR with XZR
arm64: ptrace: add XZR-safe regs accessors
arm64: include asm/assembler.h in entry-ftrace.S
arm64: fix warning about swapper_pg_dir overflow
arm64: Work around Falkor erratum 1003
arm64: head.S: Enable EL1 (host) access to SPE when entered at EL2
arm64: arch_timer: document Hisilicon erratum 161010101
arm64: use is_vmalloc_addr
arm64: use linux/sizes.h for constants
arm64: uaccess: consistently check object sizes
perf: add qcom l2 cache perf events driver
arm64: remove wrong CONFIG_PROC_SYSCTL ifdef
ARM: smccc: Update HVC comment to describe new quirk parameter
arm64: do not trace atomic operations
ACPI/IORT: Fix the error return code in iort_add_smmu_platform_device()
ACPI/IORT: Fix iort_node_get_id() mapping entries indexing
arm64: mm: enable CONFIG_HOLES_IN_ZONE for NUMA
perf: xgene: Include module.h
...
|
|
With 4 levels of 16KB pages, we get this warning about the fact that we are
copying a whole page into an array that is declared as having only two pointers
for the top level of the page table:
arch/arm64/mm/mmu.c: In function 'paging_init':
arch/arm64/mm/mmu.c:528:2: error: 'memcpy' writing 16384 bytes into a region of size 16 overflows the destination [-Werror=stringop-overflow=]
This is harmless since we actually reserve a whole page in the definition of the
array that comes from, and just the extern declaration is short. The pgdir
is initialized to zero either way, so copying the actual entries here seems
like the best solution.
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
The Qualcomm Datacenter Technologies Falkor v1 CPU may allocate TLB entries
using an incorrect ASID when TTBRx_EL1 is being updated. When the erratum
is triggered, page table entries using the new translation table base
address (BADDR) will be allocated into the TLB using the old ASID. All
circumstances leading to the incorrect ASID being cached in the TLB arise
when software writes TTBRx_EL1[ASID] and TTBRx_EL1[BADDR], a memory
operation is in the process of performing a translation using the specific
TTBRx_EL1 being written, and the memory operation uses a translation table
descriptor designated as non-global. EL2 and EL3 code changing the EL1&0
ASID is not subject to this erratum because hardware is prohibited from
performing translations from an out-of-context translation regime.
Consider the following pseudo code.
write new BADDR and ASID values to TTBRx_EL1
Replacing the above sequence with the one below will ensure that no TLB
entries with an incorrect ASID are used by software.
write reserved value to TTBRx_EL1[ASID]
ISB
write new value to TTBRx_EL1[BADDR]
ISB
write new value to TTBRx_EL1[ASID]
ISB
When the above sequence is used, page table entries using the new BADDR
value may still be incorrectly allocated into the TLB using the reserved
ASID. Yet this will not reduce functionality, since TLB entries incorrectly
tagged with the reserved ASID will never be hit by a later instruction.
Based on work by Shanker Donthineni <shankerd@codeaurora.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Christopher Covington <cov@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
To is_vmalloc_addr() to check if an address is a vmalloc address
instead of checking VMALLOC_START and VMALLOC_END manually.
Signed-off-by: Miles Chen <miles.chen@mediatek.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
Back when this was first written, dma_supported() was somewhat of a
murky mess, with subtly different interpretations being relied upon in
various places. The "does device X support DMA to address range Y?"
uses assuming Y to be physical addresses, which motivated the current
iommu_dma_supported() implementation and are alluded to in the comment
therein, have since been cleaned up, leaving only the far less ambiguous
"can device X drive address bits Y" usage internal to DMA API mask
setting. As such, there is no reason to keep a slightly misleading
callback which does nothing but duplicate the current default behaviour;
we already constrain IOVA allocations to the iommu_domain aperture where
necessary, so let's leave DMA mask business to architecture-specific
code where it belongs.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/will/linux into arm/core
|
|
When bypassing SWIOTLB on small-memory systems, we need to avoid calling
into swiotlb_dma_mapping_error() in exactly the same way as we avoid
swiotlb_dma_supported(), because the former also relies on SWIOTLB state
being initialised.
Under the assumptions for which we skip SWIOTLB, dma_map_{single,page}()
will only ever return the DMA-offset-adjusted physical address of the
page passed in, thus we can report success unconditionally.
Fixes: b67a8b29df7e ("arm64: mm: only initialize swiotlb when necessary")
CC: stable@vger.kernel.org
CC: Jisheng Zhang <jszhang@marvell.com>
Reported-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
Some but not all architectures provide set_dma_ops(). Move dma_ops
from struct dev_archdata into struct device such that it becomes
possible on all architectures to configure dma_ops per device.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-arch@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Russell King <linux@armlinux.org.uk>
Cc: x86@kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Most dma_map_ops structures are never modified. Constify these
structures such that these can be write-protected. This patch
has been generated as follows:
git grep -l 'struct dma_map_ops' |
xargs -d\\n sed -i \
-e 's/struct dma_map_ops/const struct dma_map_ops/g' \
-e 's/const struct dma_map_ops {/struct dma_map_ops {/g' \
-e 's/^const struct dma_map_ops;$/struct dma_map_ops;/' \
-e 's/const const struct dma_map_ops /const struct dma_map_ops /g';
sed -i -e 's/const \(struct dma_map_ops intel_dma_ops\)/\1/' \
$(git grep -l 'struct dma_map_ops intel_dma_ops');
sed -i -e 's/const \(struct dma_map_ops dma_iommu_ops\)/\1/' \
$(git grep -l 'struct dma_map_ops' | grep ^arch/powerpc);
sed -i -e '/^struct vmd_dev {$/,/^};$/ s/const \(struct dma_map_ops[[:blank:]]dma_ops;\)/\1/' \
-e '/^static void vmd_setup_dma_ops/,/^}$/ s/const \(struct dma_map_ops \*dest\)/\1/' \
-e 's/const \(struct dma_map_ops \*dest = \&vmd->dma_ops\)/\1/' \
drivers/pci/host/*.c
sed -i -e '/^void __init pci_iommu_alloc(void)$/,/^}$/ s/dma_ops->/intel_dma_ops./' arch/ia64/kernel/pci-dma.c
sed -i -e 's/static const struct dma_map_ops sn_dma_ops/static struct dma_map_ops sn_dma_ops/' arch/ia64/sn/pci/pci_dma.c
sed -i -e 's/(const struct dma_map_ops \*)//' drivers/misc/mic/bus/vop_bus.c
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-arch@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Russell King <linux@armlinux.org.uk>
Cc: x86@kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
The arm64 DMA-mapping implementation sets the DMA ops to the IOMMU DMA
ops if we detect that an IOMMU is present for the master and the DMA
ranges are valid.
In the case when the IOMMU domain for the device is not of type
IOMMU_DOMAIN_DMA, then we have no business swizzling the ops, since
we're not in control of the underlying address space. This patch leaves
the DMA ops alone for masters attached to non-DMA IOMMU domains.
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
The newly added DMA_ATTR_PRIVILEGED is useful for creating mappings that
are only accessible to privileged DMA engines. Implement it in
dma-iommu.c so that the ARM64 DMA IOMMU mapper can make use of it.
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
Commit b67a8b29df introduced logic to skip swiotlb allocation when all memory
is DMA accessible anyway.
While this is a great idea, __dma_alloc still calls swiotlb code unconditionally
to allocate memory when there is no CMA memory available. The swiotlb code is
called to ensure that we at least try get_free_pages().
Without initialization, swiotlb allocation code tries to access io_tlb_list
which is NULL. That results in a stack trace like this:
Unable to handle kernel NULL pointer dereference at virtual address 00000000
[...]
[<ffff00000845b908>] swiotlb_tbl_map_single+0xd0/0x2b0
[<ffff00000845be94>] swiotlb_alloc_coherent+0x10c/0x198
[<ffff000008099dc0>] __dma_alloc+0x68/0x1a8
[<ffff000000a1b410>] drm_gem_cma_create+0x98/0x108 [drm]
[<ffff000000abcaac>] drm_fbdev_cma_create_with_funcs+0xbc/0x368 [drm_kms_helper]
[<ffff000000abcd84>] drm_fbdev_cma_create+0x2c/0x40 [drm_kms_helper]
[<ffff000000abc040>] drm_fb_helper_initial_config+0x238/0x410 [drm_kms_helper]
[<ffff000000abce88>] drm_fbdev_cma_init_with_funcs+0x98/0x160 [drm_kms_helper]
[<ffff000000abcf90>] drm_fbdev_cma_init+0x40/0x58 [drm_kms_helper]
[<ffff000000b47980>] vc4_kms_load+0x90/0xf0 [vc4]
[<ffff000000b46a94>] vc4_drm_bind+0xec/0x168 [vc4]
[...]
Thankfully swiotlb code just learned how to not do allocations with the FORCE_NO
option. This patch configures the swiotlb code to use that if we decide not to
initialize the swiotlb framework.
Fixes: b67a8b29df ("arm64: mm: only initialize swiotlb when necessary")
Signed-off-by: Alexander Graf <agraf@suse.de>
CC: Jisheng Zhang <jszhang@marvell.com>
CC: Geert Uytterhoeven <geert+renesas@glider.be>
CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Cosmetic change to use phys_addr_t instead of unsigned long for the
return value of __pa_symbol().
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Miles Chen <miles.chen@mediatek.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
This patch adds support for DMA_ATTR_SKIP_CPU_SYNC attribute for
dma_{un}map_{page,sg} functions family to swiotlb.
DMA_ATTR_SKIP_CPU_SYNC allows platform code to skip synchronization of
the CPU cache for the given buffer assuming that it has been already
transferred to 'device' domain.
Ported from IOMMU .{un}map_{sg,page} ops.
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Takeshi Kihara <takeshi.kihara.df@renesas.com>
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
x86 has an option CONFIG_DEBUG_VIRTUAL to do additional checks
on virt_to_phys calls. The goal is to catch users who are calling
virt_to_phys on non-linear addresses immediately. This inclues callers
using virt_to_phys on image addresses instead of __pa_symbol. As features
such as CONFIG_VMAP_STACK get enabled for arm64, this becomes increasingly
important. Add checks to catch bad virt_to_phys usage.
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
__pa_symbol is technically the marcro that should be used for kernel
symbols. Switch to this as a pre-requisite for DEBUG_VIRTUAL which
will do bounds checking.
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
In current code, the @changed always returns the last one's status for
the huge page with the contiguous bit set. This is really not what we
want. Even one of the PTEs is changed, we should tell it to the caller.
This patch fixes this issue.
Fixes: 66b3923a1a0f ("arm64: hugetlb: add support for PTE contiguous bit")
Cc: <stable@vger.kernel.org> # 4.5.x-
Signed-off-by: Huang Shijie <shijie.huang@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Since its introduction, the UAO enable call was broken, and useless.
commit 2a6dcb2b5f3e ("arm64: cpufeature: Schedule enable() calls instead
of calling them via IPI"), fixed the framework so that these calls
are scheduled, so that they can modify PSTATE.
Now it is just useless. Remove it. UAO is enabled by the code patching
which causes get_user() and friends to use the 'ldtr' family of
instructions. This relies on the PSTATE.UAO bit being set to match
addr_limit, which we do in uao_thread_switch() called via __switch_to().
All that is needed to enable UAO is patch the code, and call schedule().
__apply_alternatives_multi_stop() calls stop_machine() when it modifies
the kernel text to enable the alternatives, (including the UAO code in
uao_thread_switch()). Once stop_machine() has finished __switch_to() is
called to reschedule the original task, this causes PSTATE.UAO to be set
appropriately. An explicit enable() call is not needed.
Reported-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
- re-introduce the arm64 get_current() optimisation
- KERN_CONT fallout fix in show_pte()
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: restore get_current() optimisation
arm64: mm: fix show_pte KERN_CONT fallout
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb
Pull swiotlb fixes from Konrad Rzeszutek Wilk:
"This has one fix to make i915 work when using Xen SWIOTLB, and a
feature from Geert to aid in debugging of devices that can't do DMA
outside the 32-bit address space.
The feature from Geert is on top of v4.10 merge window commit
(specifically you pulling my previous branch), as his changes were
dependent on the Documentation/ movement patches.
I figured it would just easier than me trying than to cherry-pick the
Documentation patches to satisfy git.
The patches have been soaking since 12/20, albeit I updated the last
patch due to linux-next catching an compiler error and adding an
Tested-and-Reported-by tag"
* 'stable/for-linus-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb:
swiotlb: Export swiotlb_max_segment to users
swiotlb: Add swiotlb=noforce debug option
swiotlb: Convert swiotlb_force from int to enum
x86, swiotlb: Simplify pci_swiotlb_detect_override()
|
|
Recent changes made KERN_CONT mandatory for continued lines. In the
absence of KERN_CONT, a newline may be implicit inserted by the core
printk code.
In show_pte, we (erroneously) use printk without KERN_CONT for continued
prints, resulting in output being split across a number of lines, and
not matching the intended output, e.g.
[ff000000000000] *pgd=00000009f511b003
, *pud=00000009f4a80003
, *pmd=0000000000000000
Fix this by using pr_cont() for all the continuations.
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Split asm-only parts of arm64 uaccess.h into a new header and use that
from *.S.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
This was entirely automated, using the script by Al:
PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>'
sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \
$(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)
to do the replacement at the end of the merge window.
Requested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Convert the flag swiotlb_force from an int to an enum, to prepare for
the advent of more possible values.
Suggested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes and cleanups from Thomas Gleixner:
"This set of updates contains:
- Robustification for the logical package managment. Cures the AMD
and virtualization issues.
- Put the correct start_cpu() return address on the stack of the idle
task.
- Fixups for the fallout of the nodeid <-> cpuid persistent mapping
modifciations
- Move the x86/MPX specific mm_struct member to the arch specific
mm_context where it belongs
- Cleanups for C89 struct initializers and useless function
arguments"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/floppy: Use designated initializers
x86/mpx: Move bd_addr to mm_context_t
x86/mm: Drop unused argument 'removed' from sync_global_pgds()
ACPI/NUMA: Do not map pxm to node when NUMA is turned off
x86/acpi: Use proper macro for invalid node
x86/smpboot: Prevent false positive out of bounds cpumask access warning
x86/boot/64: Push correct start_cpu() return address
x86/boot/64: Use 'push' instead of 'call' in start_cpu()
x86/smpboot: Make logical package management more robust
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU updates from Joerg Roedel:
"These changes include:
- support for the ACPI IORT table on ARM systems and patches to make
the ARM-SMMU driver make use of it
- conversion of the Exynos IOMMU driver to device dependency links
and implementation of runtime pm support based on that conversion
- update the Mediatek IOMMU driver to use the new struct
device->iommu_fwspec member
- implementation of dma_map/unmap_resource in the generic ARM
dma-iommu layer
- a number of smaller fixes and improvements all over the place"
* tag 'iommu-updates-v4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (44 commits)
ACPI/IORT: Make dma masks set-up IORT specific
iommu/amd: Missing error code in amd_iommu_init_device()
iommu/s390: Drop duplicate header pci.h
ACPI/IORT: Introduce iort_iommu_configure
ACPI/IORT: Add single mapping function
ACPI/IORT: Replace rid map type with type mask
iommu/arm-smmu: Add IORT configuration
iommu/arm-smmu: Split probe functions into DT/generic portions
iommu/arm-smmu-v3: Add IORT configuration
iommu/arm-smmu-v3: Split probe functions into DT/generic portions
ACPI/IORT: Add support for ARM SMMU platform devices creation
ACPI/IORT: Add node match function
ACPI: Implement acpi_dma_configure
iommu/arm-smmu-v3: Convert struct device of_node to fwnode usage
iommu/arm-smmu: Convert struct device of_node to fwnode usage
iommu: Make of_iommu_set/get_ops() DT agnostic
ACPI/IORT: Add support for IOMMU fwnode registration
ACPI/IORT: Introduce linker section for IORT entries probing
ACPI: Add FWNODE_ACPI_STATIC fwnode type
iommu/arm-smmu: Set SMTNMB_TLBEN in ACR to enable caching of bypass entries
...
|
|
acpi_map_pxm_to_node() unconditially maps nodes even when NUMA is turned
off. So acpi_get_node() might return a node > 0, which is fatal when NUMA
is disabled as the rest of the kernel assumes that only node 0 exists.
Expose numa_off to the acpi code and return NUMA_NO_NODE when it's set.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: fenghua.yu@intel.com
Cc: tony.luck@intel.com
Cc: linux-ia64@vger.kernel.org
Cc: catalin.marinas@arm.com
Cc: rjw@rjwysocki.net
Cc: will.deacon@arm.com
Cc: linux-acpi@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: lenb@kernel.org
Link: http://lkml.kernel.org/r/1481602709-18260-1-git-send-email-boris.ostrovsky@oracle.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
The flush_cache_range() function (similarly for flush_cache_page()) is
called when the kernel is changing an existing VA->PA mapping range to
either a new PA or to different attributes. Since ARMv8 has PIPT-like
D-caches, this function does not need to perform any D-cache
maintenance. The I-cache maintenance is already handled via set_pte_at()
and flush_cache_range() cannot anyway guarantee that there are no cache
lines left after invalidation due to the speculative loads.
This patch makes flush_cache_range() a no-op.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
When TTBR0_EL1 is set to the reserved page, an erroneous kernel access
to user space would generate a translation fault. This patch adds the
checks for the software-set PSR_PAN_BIT to emulate a permission fault
and report it accordingly.
Cc: Will Deacon <will.deacon@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
When the TTBR0 PAN feature is enabled, the kernel entry points need to
disable access to TTBR0_EL1. The PAN status of the interrupted context
is stored as part of the saved pstate, reusing the PSR_PAN_BIT (22).
Restoring access to TTBR0_EL1 is done on exception return if returning
to user or returning to a context where PAN was disabled.
Context switching via switch_mm() must defer the update of TTBR0_EL1
until a return to user or an explicit uaccess_enable() call.
Special care needs to be taken for two cases where TTBR0_EL1 is set
outside the normal kernel context switch operation: EFI run-time
services (via efi_set_pgd) and CPU suspend (via cpu_(un)install_idmap).
Code has been added to avoid deferred TTBR0_EL1 switching as in
switch_mm() and restore the reserved TTBR0_EL1 when uninstalling the
special TTBR0_EL1.
User cache maintenance (user_cache_maint_handler and
__flush_cache_user_range) needs the TTBR0_EL1 re-instated since the
operations are performed by user virtual address.
This patch also removes a stale comment on the switch_mm() function.
Cc: Will Deacon <will.deacon@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
This patch takes the errata workaround code out of cpu_do_switch_mm into
a dedicated post_ttbr0_update_workaround macro which will be reused in a
subsequent patch.
Cc: Will Deacon <will.deacon@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
This patch updates the description of the synchronous external aborts on
translation table walks.
Cc: Will Deacon <will.deacon@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
With no coherency to worry about, just plug'em straight in.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
When returning from idle, we rely on the fact that thread_info lives at
the end of the kernel stack, and restore this by masking the saved stack
pointer. Subsequent patches will sever the relationship between the
stack and thread_info, and to cater for this we must save/restore sp_el0
explicitly, storing it in cpu_suspend_ctx.
As cpu_suspend_ctx must be doubleword aligned, this leaves us with an
extra slot in cpu_suspend_ctx. We can use this to save/restore tpidr_el1
in the same way, which simplifies the code, avoiding pointer chasing on
the restore path (as we no longer need to load thread_info::cpu followed
by the relevant slot in __per_cpu_offset based on this).
This patch stashes both registers in cpu_suspend_ctx.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Cc: James Morse <james.morse@arm.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
The libhugetlbfs meets several failures since the following functions
do not use the correct address:
huge_ptep_get_and_clear()
huge_ptep_set_access_flags()
huge_ptep_set_wrprotect()
huge_ptep_clear_flush()
This patch fixes the wrong address for them.
Signed-off-by: Huang Shijie <shijie.huang@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
The find_num_contig() will return 1 when the pmd is not present.
It will cause a kernel dead loop in the following scenaro:
1.) pmd entry is not present.
2.) the page fault occurs:
... hugetlb_fault() --> hugetlb_no_page() --> set_huge_pte_at()
3.) set_huge_pte_at() will only set the first PMD entry, since the
find_num_contig just return 1 in this case. So the PMD entries
are all empty except the first one.
4.) when kernel accesses the address mapped by the second PMD entry,
a new page fault occurs:
... hugetlb_fault() --> huge_ptep_set_access_flags()
The second PMD entry is still empty now.
5.) When the kernel returns, the access will cause a page fault again.
The kernel will run like the "4)" above.
We will see a dead loop since here.
The dead loop is caught in the 32M hugetlb page (2M PMD + Contiguous bit).
This patch removes wrong pmd check, and fixes this dead loop.
This patch also removes the redundant checks for PGD/PUD in
the find_num_contig().
Acked-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Huang Shijie <shijie.huang@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
The default hugepage size when 64K pages are enabled is set to 2MB using
the contiguous PTE bit. The add_default_hugepagesz(), however, uses
CONT_PMD_SHIFT instead of CONT_PTE_SHIFT. There is no functional change
since the values are the same.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
This patch adds support for uprobe on ARM64 architecture.
Unit tests for following have been done so far and they have been found
working
1. Step-able instructions, like sub, ldr, add etc.
2. Simulation-able like ret, cbnz, cbz etc.
3. uretprobe
4. Reject-able instructions like sev, wfe etc.
5. trapped and abort xol path
6. probe at unaligned user address.
7. longjump test cases
Currently it does not support aarch32 instruction probing.
Signed-off-by: Pratyush Anand <panand@redhat.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Page mappings with full RWX permissions are a security risk. x86
has an option to walk the page tables and dump any bad pages.
(See e1a58320a38d ("x86/mm: Warn on W^X mappings")). Add a similar
implementation for arm64.
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Laura Abbott <labbott@redhat.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
[catalin.marinas@arm.com: folded fix for KASan out of bounds from Mark Rutland]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
The page table dumping code always assumes it will be dumping to a
seq_file to userspace. Future code will be taking advantage of
the page table dumping code but will not need the seq_file. Make
the seq_file optional for these cases.
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
ptdump_register currently initializes a set of page table information and
registers debugfs. There are uses for the ptdump option without wanting the
debugfs options. Split this out to make it a separate option.
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Now that we no longer allow live kernel PMDs to be split, it is safe to
start using the contiguous bit for kernel mappings. So set the contiguous
bit in the kernel page mappings for regions whose size and alignment are
suitable for this.
This enables the following contiguous range sizes for the virtual mapping
of the kernel image, and for the linear mapping:
granule size | cont PTE | cont PMD |
-------------+------------+------------+
4 KB | 64 KB | 32 MB |
16 KB | 2 MB | 1 GB* |
64 KB | 2 MB | 16 GB* |
* Only when built for 3 or more levels of translation. This is due to the
fact that a 2 level configuration only consists of PGDs and PTEs, and the
added complexity of dealing with folded PMDs is not justified considering
that 16 GB contiguous ranges are likely to be ignored by the hardware (and
16k/2 levels is a niche configuration)
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
In preparation of adding support for contiguous PTE and PMD mappings,
let's replace 'block_mappings_allowed' with 'page_mappings_only', which
will be a more accurate description of the nature of the setting once we
add such contiguous mappings into the mix.
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Now that we take care not manipulate the live kernel page tables in a
way that may lead to TLB conflicts, the case where a table mapping is
replaced by a block mapping can no longer occur. So remove the handling
of this at the PUD and PMD levels, and instead, BUG() on any occurrence
of live kernel page table manipulations that modify anything other than
the permission bits.
Since mark_rodata_ro() is the only caller where the kernel mappings that
are being manipulated are actually live, drop the various conditional
flush_tlb_all() invocations, and add a single call to mark_rodata_ro()
instead.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
We expect arch_teardown_dma_ops() to be called very late in a device's
life, after it has been removed from its bus, and thus after the IOMMU
bus notifier has run. As such, even if this funny little check did make
sense, it's unlikely to achieve what it thinks it's trying to do anyway.
It's a residual trace of an earlier implementation which didn't belong
here from the start; belatedly snuff it out.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
When booting on NUMA system with memory-less node (no
memory dimm on this memory controller), the print
for setup_node_data() is incorrect:
NUMA: Initmem setup node 2 [mem 0x00000000-0xffffffffffffffff]
It can be fixed by printing [mem 0x00000000-0x00000000] when
end_pfn is 0, but print <memory-less node> will be more useful.
Fixes: 1a2db300348b ("arm64, numa: Add NUMA support for arm64 platforms.")
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ganapatrao Kulkarni <gkulkarni@caviumnetworks.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yisheng Xie <xieyisheng1@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
The pcpu_build_alloc_info() function group CPUs according to their
proximity, by call callback function @cpu_distance_fn from different
ARCHs.
For arm64 the callback of @cpu_distance_fn is
pcpu_cpu_distance(from, to)
-> node_distance(from, to)
The @from and @to for function node_distance() should be nid.
However, pcpu_cpu_distance() in arch/arm64/mm/numa.c just past the
cpu id for @from and @to, and didn't convert to numa node id.
For this incorrect cpu proximity get from ARCH, it may cause each CPU
in one group and make group_cnt out of bound:
setup_per_cpu_areas()
pcpu_embed_first_chunk()
pcpu_build_alloc_info()
in pcpu_build_alloc_info, since cpu_distance_fn will return
REMOTE_DISTANCE if we pass cpu ids (0,1,2...), so
cpu_distance_fn(cpu, tcpu) > LOCAL_DISTANCE will wrongly be ture.
This may results in triggering the BUG_ON(unit != nr_units) later:
[ 0.000000] kernel BUG at mm/percpu.c:1916!
[ 0.000000] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[ 0.000000] Modules linked in:
[ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.9.0-rc1-00003-g14155ca-dirty #26
[ 0.000000] Hardware name: Hisilicon Hi1616 Evaluation Board (DT)
[ 0.000000] task: ffff000008d6e900 task.stack: ffff000008d60000
[ 0.000000] PC is at pcpu_embed_first_chunk+0x420/0x704
[ 0.000000] LR is at pcpu_embed_first_chunk+0x3bc/0x704
[ 0.000000] pc : [<ffff000008c754f4>] lr : [<ffff000008c75490>] pstate: 800000c5
[ 0.000000] sp : ffff000008d63eb0
[ 0.000000] x29: ffff000008d63eb0 [ 0.000000] x28: 0000000000000000
[ 0.000000] x27: 0000000000000040 [ 0.000000] x26: ffff8413fbfcef00
[ 0.000000] x25: 0000000000000042 [ 0.000000] x24: 0000000000000042
[ 0.000000] x23: 0000000000001000 [ 0.000000] x22: 0000000000000046
[ 0.000000] x21: 0000000000000001 [ 0.000000] x20: ffff000008cb3bc8
[ 0.000000] x19: ffff8413fbfcf570 [ 0.000000] x18: 0000000000000000
[ 0.000000] x17: ffff000008e49ae0 [ 0.000000] x16: 0000000000000003
[ 0.000000] x15: 000000000000001e [ 0.000000] x14: 0000000000000004
[ 0.000000] x13: 0000000000000000 [ 0.000000] x12: 000000000000006f
[ 0.000000] x11: 00000413fbffff00 [ 0.000000] x10: 0000000000000004
[ 0.000000] x9 : 0000000000000000 [ 0.000000] x8 : 0000000000000001
[ 0.000000] x7 : ffff8413fbfcf63c [ 0.000000] x6 : ffff000008d65d28
[ 0.000000] x5 : ffff000008d65e50 [ 0.000000] x4 : 0000000000000000
[ 0.000000] x3 : ffff000008cb3cc8 [ 0.000000] x2 : 0000000000000040
[ 0.000000] x1 : 0000000000000040 [ 0.000000] x0 : 0000000000000000
[...]
[ 0.000000] Call trace:
[ 0.000000] Exception stack(0xffff000008d63ce0 to 0xffff000008d63e10)
[ 0.000000] 3ce0: ffff8413fbfcf570 0001000000000000 ffff000008d63eb0 ffff000008c754f4
[ 0.000000] 3d00: ffff000008d63d50 ffff0000081af210 00000413fbfff010 0000000000001000
[ 0.000000] 3d20: ffff000008d63d50 ffff0000081af220 00000413fbfff010 0000000000001000
[ 0.000000] 3d40: 00000413fbfcef00 0000000000000004 ffff000008d63db0 ffff0000081af390
[ 0.000000] 3d60: 00000413fbfcef00 0000000000001000 0000000000000000 0000000000001000
[ 0.000000] 3d80: 0000000000000000 0000000000000040 0000000000000040 ffff000008cb3cc8
[ 0.000000] 3da0: 0000000000000000 ffff000008d65e50 ffff000008d65d28 ffff8413fbfcf63c
[ 0.000000] 3dc0: 0000000000000001 0000000000000000 0000000000000004 00000413fbffff00
[ 0.000000] 3de0: 000000000000006f 0000000000000000 0000000000000004 000000000000001e
[ 0.000000] 3e00: 0000000000000003 ffff000008e49ae0
[ 0.000000] [<ffff000008c754f4>] pcpu_embed_first_chunk+0x420/0x704
[ 0.000000] [<ffff000008c6658c>] setup_per_cpu_areas+0x38/0xc8
[ 0.000000] [<ffff000008c608d8>] start_kernel+0x10c/0x390
[ 0.000000] [<ffff000008c601d8>] __primary_switched+0x5c/0x64
[ 0.000000] Code: b8018660 17ffffd7 6b16037f 54000080 (d4210000)
[ 0.000000] ---[ end trace 0000000000000000 ]---
[ 0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
Fix by getting cpu's node id with early_cpu_to_node() then pass it
to node_distance() as the original intention.
Fixes: 7af3a0a99252 ("arm64/numa: support HAVE_SETUP_PER_CPU_AREA")
Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
All the lines printed by mem_init are independent, with each ending with
a newline. While they logically form a large block, none are actually
continuations of previous lines.
The kernel-side printk code and the userspace demsg tool differ in their
handling of KERN_CONT following a newline, and while this isn't always a
problem kernel-side, it does cause difficulty for userspace. Using
pr_cont causes the userspace tool to not print line prefix (e.g.
timestamps) even when following a newline, mis-aligning the output and
making it harder to read, e.g.
[ 0.000000] Virtual kernel memory layout:
[ 0.000000] modules : 0xffff000000000000 - 0xffff000008000000 ( 128 MB)
vmalloc : 0xffff000008000000 - 0xffff7dffbfff0000 (129022 GB)
.text : 0xffff000008080000 - 0xffff0000088b0000 ( 8384 KB)
.rodata : 0xffff0000088b0000 - 0xffff000008c50000 ( 3712 KB)
.init : 0xffff000008c50000 - 0xffff000008d50000 ( 1024 KB)
.data : 0xffff000008d50000 - 0xffff000008e25200 ( 853 KB)
.bss : 0xffff000008e25200 - 0xffff000008e6bec0 ( 284 KB)
fixed : 0xffff7dfffe7fd000 - 0xffff7dfffec00000 ( 4108 KB)
PCI I/O : 0xffff7dfffee00000 - 0xffff7dffffe00000 ( 16 MB)
vmemmap : 0xffff7e0000000000 - 0xffff800000000000 ( 2048 GB maximum)
0xffff7e0000000000 - 0xffff7e0026000000 ( 608 MB actual)
memory : 0xffff800000000000 - 0xffff800980000000 ( 38912 MB)
[ 0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=6, Nodes=1
Fix this by using pr_notice consistently for all lines, which both the
kernel and userspace are happy with.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|