Age | Commit message (Collapse) | Author | Files | Lines |
|
Also remove the dependency on ARCv2, to increase compile coverage for
!ARCV2 builds
Acked-by: Daniel Lezcano <daniel.lezcnao@linaro.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
ARC timers use aux registers for programming and this paves way for
moving ARC timer drivers into drivers/clocksource
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
commit 1c3c90930392 broke PAE40. Macro pfn_pte(pfn, prot) creates paddr
from pfn, but the page shift was getting truncated to 32 bits since we lost
the proper cast to 64 bits (for PAE400
Instead of reverting that commit, use a better helper which is 32/64 bits
safe just like ARM implementation.
Fixes: 1c3c90930392 ("ARC: mm: fix build breakage with STRICT_MM_TYPECHECKS")
Cc: <stable@vger.kernel.org> #4.4+
Signed-off-by: Yuriy Kolerov <yuriy.kolerov@synopsys.com>
[vgupta: massaged changelog]
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
Apparenty this is coming in the way of gcc fix which inhibits the usage
of LP_COUNT as a gpr.
Cc: stable@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
This came up when reviewing code to address missing IRQ affinity
setting in AXS103 platform and/or implementing hierarchical IRQ domains
- smp_ipi_irq_setup() callers pass hwirq but in turn calls
request_percpu_irq() which expects a linux virq. So invoke
irq_find_mapping() to do the conversion
(also explicitify this in code by renaming the args appropriately)
- idu_of_init()/idu_cascade_isr() were similarly using linux virq where
hwirq is expected, so do the conversion using irqd_to_hwirq() helper
Signed-off-by: Yuriy Kolerov <yuriy.kolerov@synopsys.com>
[vgupta: made changelog a bit concise a bit]
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
The original syscall only used to return errno to indicate if cmpxchg
succeeded. It was not returning the "previous" value which typical cmpxchg
callers are interested in to build their slowpaths or retry loops.
Given user preemption in syscall return path etc, it is not wise to
check this in userspace afterwards, but should be what kernel actually
observed in the syscall.
So change the syscall interface to always return the previous value and
additionally set Z flag to indicate whether operation succeeded or not
(just like ARM implementation when they used to have this syscall)
The flag approach avoids having to put_user errno which is nice given
the use case for this syscall cares mostly about the "previous" value.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
The loop was really needed in .debug_frame regime where wanted make it
as SH_ALLOC so that apply_relocate_add() would process it. That's not
needed for .eh_frame, so we check this in apply_relocate_add() which
gets called for each section.
Note that we need to save reference to "section name strings" section in
module_frob_arch_sections() since apply_relocate_add() doesn't get that
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
The motivation is to identify ARC750 vs. ARC770 (we currently print
generic "ARC700").
A given ARC700 release could be 750 or 770, with same ARCNUM (or family
identifier which is unfortunate). The existing arc_cpu_tbl[] kept a single
concatenated string for core name and release which thus doesn't work
for 750 vs. 770 identification.
So split this into 2 tables, one with core names and other with release.
And while we are at it, get rid of the range checking for family numbers.
We just document the known to exist cores running Linux and ditch
others.
With this in place, we add detection of ARC750 which is
- cores 0x33 and before
- cores 0x34 and later with MMUv2
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
This came to light when helping a customer with oldish ARC750 core who
were getting instruction errors because of lack of SWAPE but boot log
was incorrectly printing it as being present
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
On older arc700 cores, some of the features configured were not present
in Build config registers. To print about them at boot, we just use the
Kconfig option i.e. whether linux is built to use them or not.
So yes this seems bogus, but what else can be done. Moreover if linux is
booting with these enabled, then the Kconfig info is a good indicator
anyways.
Over time these "hacks" accumulated in read_arc_build_cfg_regs() as well
as arc_cpu_mumbojumbo(). so refactor and move all of those in a single
place: read_arc_build_cfg_regs(). This causes some code redcution too:
| bloat-o-meter2 arch/arc/kernel/setup.o.0 arch/arc/kernel/setup.o.1
| add/remove: 0/0 grow/shrink: 2/1 up/down: 64/-132 (-68)
| function old new delta
| setup_processor 610 670 +60
| cpuinfo_arc700 76 80 +4
| arc_cpu_mumbojumbo 752 620 -132
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
Previously we would not print the case when IOC existed but was not
enabled.
And while at it, reduce one line off boot printing by consolidating
the Peripheral address space and IO-Coherency which in a way
applies to them
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
if user disables IOC from debugger at startup (by clearing @ioc_enable),
@ioc_exists is cleared too. This means boot prints don't capture the
fact that IOC was present but disabled which could be misleading.
So invert how we use @ioc_enable and @ioc_exists and make it more
canonical. @ioc_exists represent whether hardware is present or not and
stays same whether enabled or not. @ioc_enable is still user driven,
but will be auto-disabled if IOC hardware is not present, i.e. if
@ioc_exist=0. This is opposite to what we were doing before, but much
clearer.
This means @ioc_enable is now the "exported" toggle in rest of code such
as dma mapping API.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
Older ARC700 cores (ARC750 specifically) lack instructions to implement
atomic r-w-w. This is problematic for userspace libraries such as NPTL
which need atomic primitives. So enable them by providing kernel assist.
This is costly but really the only sane soluton (othern than tight
spinning using the otherwise availiable atomic exchange EX instruciton).
Good thing is there are only a few of these cores running Linux out in
the wild.
This only works on UP systems.
Reviewed-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
The cast valid since TASK_SIZE * 2 will never actually cause overflow.
| CC fs/binfmt_elf.o
| In file included from ../include/linux/elf.h:4:0,
| from ../include/linux/module.h:15,
| from ../fs/binfmt_elf.c:12:
| ../fs/binfmt_elf.c: In function load_elf_binar:
| ../arch/arc/include/asm/elf.h:57:29: warning: integer overflow in expression [-Woverflow]
| #define ELF_ET_DYN_BASE (2 * TASK_SIZE / 3)
| ^
| ../fs/binfmt_elf.c:921:16: note: in expansion of macro ELF_ET_DYN_BASE
| load_bias = ELF_ET_DYN_BASE - vaddr;
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
The IDU intc is technically part of MCIP (Multi-core IP) hence
historically was only available in a SMP hardware build (and thus only
in a SMP kernel build). Now that hardware restriction has been lifted,
so a UP kernel needs to support it.
This requires breaking mcip.c into parts which are strictly SMP
(inter-core interrupts) and IDU which in reality is just another
intc and thus has no bearing on SMP.
This change allows IDU in UP builds and with a suitable device tree, we
can have the cascaded intc system
ARCv2 core intc <---> ARCv2 IDU intc <---> periperals
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI updates from Bjorn Helgaas:
"Summary of PCI changes for the v4.9 merge window:
Enumeration:
- microblaze: Add multidomain support for procfs (Bharat Kumar Gogada)
Resource management:
- Ignore requested alignment for PROBE_ONLY and fixed resources (Yongji Xie)
- Ignore requested alignment for VF BARs (Yongji Xie)
PCI device hotplug:
- Make core explicitly non-modular (Paul Gortmaker)
PCIe native device hotplug:
- Rename pcie_isr() locals for clarity (Bjorn Helgaas)
- Return IRQ_NONE when we can't read interrupt status (Bjorn Helgaas)
- Remove unnecessary guard (Bjorn Helgaas)
- Clean up dmesg "Slot(%s)" messages (Bjorn Helgaas)
- Remove useless pciehp_get_latch_status() calls (Bjorn Helgaas)
- Clear attention LED on device add (Keith Busch)
- Allow exclusive userspace control of indicators (Keith Busch)
- Process all hotplug events before looking for new ones (Mayurkumar Patel)
- Don't re-read Slot Status when queuing hotplug event (Mayurkumar Patel)
- Don't re-read Slot Status when handling surprise event (Mayurkumar Patel)
- Make explicitly non-modular (Paul Gortmaker)
Power management:
- Afford direct-complete to devices with non-standard PM (Lukas Wunner)
- Query platform firmware for device power state (Lukas Wunner)
- Recognize D3cold in pci_update_current_state() (Lukas Wunner)
- Avoid unnecessary resume after direct-complete (Lukas Wunner)
- Make explicitly non-modular (Paul Gortmaker)
Virtualization:
- Mark Atheros AR9580 to avoid bus reset (Maik Broemme)
- Check for pci_setup_device() failure in pci_iov_add_virtfn() (Po Liu)
MSI:
- Enable PCI_MSI_IRQ_DOMAIN support for ARC (Joao Pinto)
AER:
- Remove aerdriver.nosourceid kernel parameter (Bjorn Helgaas)
- Remove aerdriver.forceload kernel parameter (Bjorn Helgaas)
- Fix aer_probe() kernel-doc comment (Cao jin)
- Add bus flag to skip source ID matching (Jon Derrick)
- Avoid memory allocation in interrupt handling path (Jon Derrick)
- Cache capability position (Keith Busch)
- Make explicitly non-modular (Paul Gortmaker)
- Remove duplicate AER severity translation (Tyler Baicar)
- Send correct severity to calculate AER severity (Tyler Baicar)
Precision Time Measurement:
- Add Precision Time Measurement (PTM) support (Jonathan Yong)
- Add PTM clock granularity information (Bjorn Helgaas)
- Add pci_enable_ptm() for drivers to enable PTM on endpoints (Bjorn Helgaas)
Generic host bridge driver:
- Fix pci_remap_iospace() failure path (Lorenzo Pieralisi)
- Make explicitly non-modular (Paul Gortmaker)
Altera host bridge driver:
- Remove redundant platform_get_resource() return value check (Bjorn Helgaas)
- Poll for link training status after retraining the link (Ley Foon Tan)
- Rework config accessors for use without a struct pci_bus (Ley Foon Tan)
- Move retrain from fixup to altera_pcie_host_init() (Ley Foon Tan)
- Make MSI explicitly non-modular (Paul Gortmaker)
- Make explicitly non-modular (Paul Gortmaker)
- Relax device number checking to allow SR-IOV (Po Liu)
ARM Versatile host bridge driver:
- Fix pci_remap_iospace() failure path (Lorenzo Pieralisi)
Axis ARTPEC-6 host bridge driver:
- Drop __init from artpec6_add_pcie_port() (Niklas Cassel)
Freescale i.MX6 host bridge driver:
- Make explicitly non-modular (Paul Gortmaker)
Intel VMD host bridge driver:
- Add quirk for AER to ignore source ID (Jon Derrick)
- Allocate IRQ lists with correct MSI-X count (Jon Derrick)
- Convert to use pci_alloc_irq_vectors() API (Jon Derrick)
- Eliminate vmd_vector member from list type (Jon Derrick)
- Eliminate index member from IRQ list (Jon Derrick)
- Synchronize with RCU freeing MSI IRQ descs (Keith Busch)
- Request userspace control of PCIe hotplug indicators (Keith Busch)
- Move VMD driver to drivers/pci/host (Keith Busch)
Marvell Aardvark host bridge driver:
- Fix pci_remap_iospace() failure path (Lorenzo Pieralisi)
- Remove redundant dev_err call in advk_pcie_probe() (Wei Yongjun)
Microsoft Hyper-V host bridge driver:
- Use zero-length array in struct pci_packet (Dexuan Cui)
- Use pci_function_description[0] in struct definitions (Dexuan Cui)
- Remove the unused 'wrk' in struct hv_pcibus_device (Dexuan Cui)
- Handle vmbus_sendpacket() failure in hv_compose_msi_msg() (Dexuan Cui)
- Handle hv_pci_generic_compl() error case (Dexuan Cui)
- Use list_move_tail() instead of list_del() + list_add_tail() (Wei Yongjun)
NVIDIA Tegra host bridge driver:
- Fix pci_remap_iospace() failure path (Lorenzo Pieralisi)
- Remove redundant _data suffix (Thierry Reding)
- Use of_device_get_match_data() (Thierry Reding)
Qualcomm host bridge driver:
- Make explicitly non-modular (Paul Gortmaker)
Renesas R-Car host bridge driver:
- Consolidate register space lookup and ioremap (Bjorn Helgaas)
- Don't disable/unprepare clocks on prepare/enable failure (Geert Uytterhoeven)
- Add multi-MSI support (Grigory Kletsko)
- Fix pci_remap_iospace() failure path (Lorenzo Pieralisi)
- Fix some checkpatch warnings (Sergei Shtylyov)
- Try increasing PCIe link speed to 5 GT/s at boot (Sergei Shtylyov)
Rockchip host bridge driver:
- Add DT bindings for Rockchip PCIe controller (Shawn Lin)
- Add Rockchip PCIe controller support (Shawn Lin)
- Improve the deassert sequence of four reset pins (Shawn Lin)
- Fix wrong transmitted FTS count (Shawn Lin)
- Increase the Max Credit update interval (Rajat Jain)
Samsung Exynos host bridge driver:
- Make explicitly non-modular (Paul Gortmaker)
ST Microelectronics SPEAr13xx host bridge driver:
- Make explicitly non-modular (Paul Gortmaker)
Synopsys DesignWare host bridge driver:
- Return data directly from dw_pcie_readl_rc() (Bjorn Helgaas)
- Exchange viewport of `MEMORYs' and `CFGs/IOs' (Dong Bo)
- Check LTSSM training bit before deciding link is up (Jisheng Zhang)
- Move link wait definitions to .c file (Joao Pinto)
- Wait for iATU enable (Joao Pinto)
- Add iATU Unroll feature (Joao Pinto)
- Fix pci_remap_iospace() failure path (Lorenzo Pieralisi)
- Make explicitly non-modular (Paul Gortmaker)
- Relax device number checking to allow SR-IOV (Po Liu)
- Keep viewport fixed for IO transaction if num_viewport > 2 (Pratyush Anand)
- Remove redundant platform_get_resource() return value check (Wei Yongjun)
TI DRA7xx host bridge driver:
- Make explicitly non-modular (Paul Gortmaker)
TI Keystone host bridge driver:
- Propagate request_irq() failure (Wei Yongjun)
Xilinx AXI host bridge driver:
- Keep both legacy and MSI interrupt domain references (Bharat Kumar Gogada)
- Clear interrupt register for invalid interrupt (Bharat Kumar Gogada)
- Clear correct MSI set bit (Bharat Kumar Gogada)
- Dispose of MSI virtual IRQ (Bharat Kumar Gogada)
- Make explicitly non-modular (Paul Gortmaker)
- Relax device number checking to allow SR-IOV (Po Liu)
Xilinx NWL host bridge driver:
- Expand error logging (Bharat Kumar Gogada)
- Enable all MSI interrupts using MSI mask (Bharat Kumar Gogada)
- Make explicitly non-modular (Paul Gortmaker)
Miscellaneous:
- Drop CONFIG_KEXEC_CORE ifdeffery (Lukas Wunner)
- portdrv: Make explicitly non-modular (Paul Gortmaker)
- Make DPC explicitly non-modular (Paul Gortmaker)"
* tag 'pci-v4.9-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (105 commits)
x86/PCI: VMD: Move VMD driver to drivers/pci/host
PCI: rockchip: Fix wrong transmitted FTS count
PCI: rockchip: Improve the deassert sequence of four reset pins
PCI: rockchip: Increase the Max Credit update interval
PCI: rcar: Try increasing PCIe link speed to 5 GT/s at boot
PCI/AER: Fix aer_probe() kernel-doc comment
PCI: Ignore requested alignment for VF BARs
PCI: Ignore requested alignment for PROBE_ONLY and fixed resources
PCI: Avoid unnecessary resume after direct-complete
PCI: Recognize D3cold in pci_update_current_state()
PCI: Query platform firmware for device power state
PCI: Afford direct-complete to devices with non-standard PM
PCI/AER: Cache capability position
PCI/AER: Avoid memory allocation in interrupt handling path
x86/PCI: VMD: Request userspace control of PCIe hotplug indicators
PCI: pciehp: Allow exclusive userspace control of indicators
ACPI / APEI: Send correct severity to calculate AER severity
PCI/AER: Remove duplicate AER severity translation
x86/PCI: VMD: Synchronize with RCU freeing MSI IRQ descs
x86/PCI: VMD: Eliminate index member from IRQ list
...
|
|
Commit d9676fa152c83b ("ARCv2: Enable LOCKDEP"), changed
local_save_flags() to not return raw STATUS32 but encoded in the form
such that it could be fed directly to CLRI/SETI instructions.
However the STATUS32.E[] was not captured correctly as it corresponds to
bits [4:1] in the register and not [3:0]
Fixes: d9676fa152c83b ("ARCv2: Enable LOCKDEP")
Cc: Evgeny Voevodin <evgeny.voevodin@intel.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
1. detect whether binutils supports the cfi pseudo ops
2. define conditional macros to generate the ops
3. define new ENTRY_CFI/END_CFI to annotate hand asm code.
- Needed because we don't want to emit dwarf info in general ENTRY/END
used by lowest level trap/exception/interrutp handlers as unwinder
gets confused trying to unwind out of them. We want unwinder to
instead stop when it hits onfo those routines
- These provide minimal start/end cfi ops assuming routine doesn't
touch stack memory/regs
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
We used to live with PERF_COUNT_HW_CACHE_REFERENCES and
PERF_COUNT_HW_CACHE_REFERENCES not specified on ARC.
Those events are actually aliases to 2 cache events that we do support
and so this change sets "cache-reference" and "cache-misses" events
in the same way as "L1-dcache-loads" and L1-dcache-load-misses.
And while at it adding debug info for cache events as well as doing a
subtle fix in HW events debug info - config value is much better
represented by hex so we may see not only event index but as well other
control bits set (if they exist).
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-snps-arc@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
Build brekeage since last changes to generic atomic operations.
Added couple of missing macros which are now mandatory
Signed-off-by: Noam Camus <noamca@mellanox.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
ARCv2 ISA provides 64-bit exclusive load/stores so use them to implement
the 64-bit atomics and elide the spinlock based generic 64-bit atomics
boot tested with atomic64 self-test (and GOD bless the person who wrote
them, I realized my inline assmebly is sloppy as hell)
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-snps-arc@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
HS release 3.0 provides for even more flexibility in specifying the
volatile address space for mapping peripherals.
With HS 2.1 @start was made flexible / programmable - with HS 3.0 even
@end can be setup (vs. fixed to 0xFFFF_FFFF before).
So add code to reflect that and while at it remove an unused struct
defintion
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull uaccess fixes from Al Viro:
"Fixes for broken uaccess primitives - mostly lack of proper zeroing
in copy_from_user()/get_user()/__get_user(), but for several
architectures there's more (broken clear_user() on frv and
strncpy_from_user() on hexagon)"
* 'uaccess-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (28 commits)
avr32: fix copy_from_user()
microblaze: fix __get_user()
microblaze: fix copy_from_user()
m32r: fix __get_user()
blackfin: fix copy_from_user()
sparc32: fix copy_from_user()
sh: fix copy_from_user()
sh64: failing __get_user() should zero
score: fix copy_from_user() and friends
score: fix __get_user/get_user
s390: get_user() should zero on failure
ppc32: fix copy_from_user()
parisc: fix copy_from_user()
openrisc: fix copy_from_user()
nios2: fix __get_user()
nios2: copy_from_user() should zero the tail of destination
mn10300: copy_from_user() should zero on access_ok() failure...
mn10300: failing __get_user() and get_user() should zero
mips: copy_from_user() must zero the destination on access_ok() failure
ARC: uaccess: get_user to zero out dest in cause of fault
...
|
|
Al reported potential issue with ARC get_user() as it wasn't clearing
out destination pointer in case of fault due to bad address etc.
Verified using following
| {
| u32 bogus1 = 0xdeadbeef;
| u64 bogus2 = 0xdead;
| int rc1, rc2;
|
| pr_info("Orig values %x %llx\n", bogus1, bogus2);
| rc1 = get_user(bogus1, (u32 __user *)0x40000000);
| rc2 = get_user(bogus2, (u64 __user *)0x50000000);
| pr_info("access %d %d, new values %x %llx\n",
| rc1, rc2, bogus1, bogus2);
| }
| [ARCLinux]# insmod /mnt/kernel-module/qtn.ko
| Orig values deadbeef dead
| access -14 -14, new values 0 0
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-snps-arc@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Add ARC as an arch that supports PCI_MSI_IRQ_DOMAIN and add generation of
msi.h in the ARC arch.
Signed-off-by: Joao Pinto <jpinto@synopsys.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
|
|
| CC mm/memory.o
| In file included from ../mm/memory.c:53:0:
| ../include/linux/pfn_t.h: In function ‘pfn_t_pte’:
| ../include/linux/pfn_t.h:78:2: error: conversion to non-scalar type requested
| return pfn_pte(pfn_t_to_pfn(pfn), pgprot);
With STRICT_MM_TYPECHECKS pte_t is a struct and the offending code
forces a cast which ends up shifting a struct and hence the gcc warning.
Note that in recent past some of the arches (aarch64, s390) made
STRICT_MM_TYPECHECKS default, but we don't for ARC as this leads to slightly
worse generated code, given ARC ABI definition of returning structs
(which pte_t would become)
Quoting from ARC ABI...
"Results of type struct are returned in a caller-supplied temporary
variable whose address is passed in r0.
For such functions, the arguments are shifted so that they are
passed in r1 and up."
So
- struct to be returned would be allocated on stack requiring extra
code at call sites
- callee updates stack memory to facilitate the return (vs. simple
MOV into return reg r0)
Hence STRICT_MM_TYPECHECKS is not enabled by default for ARC
Cc: <stable@vger.kernel.org> #4.4+
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
User mode callee regs are explicitly collected before signal delivery or
breakpoint trap. r25 is special for kernel as it serves as task pointer,
so user mode value is clobbered very early. It is saved in pt_regs where
generally only scratch (aka caller saved) regs are saved.
The code to access the corresponding pt_regs location had a subtle bug as
it was using load/store with scaling of offset, whereas the offset was already
byte wise correct. So fix this by replacing LD.AS with a standard LD
Cc: <stable@vger.kernel.org>
Signed-off-by: Liav Rehana <liavr@mellanox.com>
Reviewed-by: Alexey Brodkin <abrodkin@synopsys.com>
[vgupta: rewrote title and commit log]
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
trace_hardirqs_on_caller() in lockdep.c expects to be called before, not
after interrupts are actually enabled.
The following comment in kernel/locking/lockdep.c substantiates this
claim:
"
/*
* We're enabling irqs and according to our state above irqs weren't
* already enabled, yet we find the hardware thinks they are in fact
* enabled.. someone messed up their IRQ state tracing.
*/
"
An example can be found in include/linux/irqflags.h:
do { trace_hardirqs_on(); raw_local_irq_enable(); } while (0)
Without this change, we hit the following DEBUG_LOCKS_WARN_ON.
[ 7.760000] ------------[ cut here ]------------
[ 7.760000] WARNING: CPU: 0 PID: 1 at kernel/locking/lockdep.c:2711 resume_user_mode_begin+0x48/0xf0
[ 7.770000] DEBUG_LOCKS_WARN_ON(!irqs_disabled())
[ 7.780000] Modules linked in:
[ 7.780000] CPU: 0 PID: 1 Comm: init Not tainted 4.7.0-00003-gc668bb9-dirty #366
[ 7.790000]
[ 7.790000] Stack Trace:
[ 7.790000] arc_unwind_core.constprop.1+0xa4/0x118
[ 7.800000] warn_slowpath_fmt+0x72/0x158
[ 7.800000] resume_user_mode_begin+0x48/0xf0
[ 7.810000] ---[ end trace 6f6a7a8fae20d2f0 ]---
Signed-off-by: Daniel Mentz <danielmentz@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Pull ARC updates from Vineet Gupta:
"Things have been calm here - nothing much except for a few fixes"
* tag 'arc-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
ARC: mm: don't loose PTE_SPECIAL in pte_modify()
ARC: dma: fix address translation in arc_dma_free
ARC: typo fix in mm/ioremap.c
ARC: fix linux-next build breakage
|
|
LTP madvise05 was generating mm splat
| [ARCLinux]# /sd/ltp/testcases/bin/madvise05
| BUG: Bad page map in process madvise05 pte:80e08211 pmd:9f7d4000
| page:9fdcfc90 count:1 mapcount:-1 mapping: (null) index:0x0 flags: 0x404(referenced|reserved)
| page dumped because: bad pte
| addr:200b8000 vm_flags:00000070 anon_vma: (null) mapping: (null) index:1005c
| file: (null) fault: (null) mmap: (null) readpage: (null)
| CPU: 2 PID: 6707 Comm: madvise05
And for newer kernels, the system was rendered unusable afterwards.
The problem was mprotect->pte_modify() clearing PTE_SPECIAL (which is
set to identify the special zero page wired to the pte).
When pte was finally unmapped, special casing for zero page was not
done, and instead it was treated as a "normal" page, tripping on the
map counts etc.
This fixes ARC STAR 9001053308
Cc: <stable@vger.kernel.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:
"The locking tree was busier in this cycle than the usual pattern - a
couple of major projects happened to coincide.
The main changes are:
- implement the atomic_fetch_{add,sub,and,or,xor}() API natively
across all SMP architectures (Peter Zijlstra)
- add atomic_fetch_{inc/dec}() as well, using the generic primitives
(Davidlohr Bueso)
- optimize various aspects of rwsems (Jason Low, Davidlohr Bueso,
Waiman Long)
- optimize smp_cond_load_acquire() on arm64 and implement LSE based
atomic{,64}_fetch_{add,sub,and,andnot,or,xor}{,_relaxed,_acquire,_release}()
on arm64 (Will Deacon)
- introduce smp_acquire__after_ctrl_dep() and fix various barrier
mis-uses and bugs (Peter Zijlstra)
- after discovering ancient spin_unlock_wait() barrier bugs in its
implementation and usage, strengthen its semantics and update/fix
usage sites (Peter Zijlstra)
- optimize mutex_trylock() fastpath (Peter Zijlstra)
- ... misc fixes and cleanups"
* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (67 commits)
locking/atomic: Introduce inc/dec variants for the atomic_fetch_$op() API
locking/barriers, arch/arm64: Implement LDXR+WFE based smp_cond_load_acquire()
locking/static_keys: Fix non static symbol Sparse warning
locking/qspinlock: Use __this_cpu_dec() instead of full-blown this_cpu_dec()
locking/atomic, arch/tile: Fix tilepro build
locking/atomic, arch/m68k: Remove comment
locking/atomic, arch/arc: Fix build
locking/Documentation: Clarify limited control-dependency scope
locking/atomic, arch/rwsem: Employ atomic_long_fetch_add()
locking/atomic, arch/qrwlock: Employ atomic_fetch_add_acquire()
locking/atomic, arch/mips: Convert to _relaxed atomics
locking/atomic, arch/alpha: Convert to _relaxed atomics
locking/atomic: Remove the deprecated atomic_{set,clear}_mask() functions
locking/atomic: Remove linux/atomic.h:atomic_fetch_or()
locking/atomic: Implement atomic{,64,_long}_fetch_{add,sub,and,andnot,or,xor}{,_relaxed,_acquire,_release}()
locking/atomic: Fix atomic64_relaxed() bits
locking/atomic, arch/xtensa: Implement atomic_fetch_{add,sub,and,or,xor}()
locking/atomic, arch/x86: Implement atomic{,64}_fetch_{add,sub,and,or,xor}()
locking/atomic, arch/tile: Implement atomic{,64}_fetch_{add,sub,and,or,xor}()
locking/atomic, arch/sparc: Implement atomic{,64}_fetch_{add,sub,and,or,xor}()
...
|
|
__GFP_REPEAT has a rather weak semantic but since it has been introduced
around 2.6.12 it has been ignored for low order allocations.
pte_alloc_one_kernel uses __get_order_pte but this is obviously always
zero because BITS_FOR_PTE is not larger than 9 yet the page size is
always larger than 4K. This means that this flag has never been
actually useful here because it has always been used only for
PAGE_ALLOC_COSTLY requests.
Link: http://lkml.kernel.org/r/1464599699-30131-7-git-send-email-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Resolve conflict between commits:
fbffe892e525 ("locking/atomic, arch/arc: Implement atomic_fetch_{add,sub,and,andnot,or,xor}()")
and:
ed6aefed726a ("Revert "ARCv2: spinlock/rwlock/atomics: Delayed retry of failed SCOND with exponential backoff"")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nigel Topham <ntopham@synopsys.com>
Cc: Noam Camus <noamc@ezchip.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: linux-kernel@vger.kernel.org
Cc: linux-snps-arc@lists.infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Since all architectures have this implemented now natively, remove this
dead code.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-arch@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Implement FETCH-OP atomic primitives, these are very similar to the
existing OP-RETURN primitives we already have, except they return the
value of the atomic variable _before_ modification.
This is especially useful for irreversible operations -- such as
bitops (because it becomes impossible to reconstruct the state prior
to modification).
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Noam Camus <noamc@ezchip.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-arch@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-snps-arc@lists.infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
This patch updates/fixes all spin_unlock_wait() implementations.
The update is in semantics; where it previously was only a control
dependency, we now upgrade to a full load-acquire to match the
store-release from the spin_unlock() we waited on. This ensures that
when spin_unlock_wait() returns, we're guaranteed to observe the full
critical section we waited on.
This fixes a number of spin_unlock_wait() users that (not
unreasonably) rely on this.
I also fixed a number of ticket lock versions to only wait on the
current lock holder, instead of for a full unlock, as this is
sufficient.
Furthermore; again for ticket locks; I added an smp_rmb() in between
the initial ticket load and the spin loop testing the current value
because I could not convince myself the address dependency is
sufficient, esp. if the loads are of different sizes.
I'm more than happy to remove this smp_rmb() again if people are
certain the address dependency does indeed work as expected.
Note: PPC32 will be fixed independently
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: chris@zankel.net
Cc: cmetcalf@mellanox.com
Cc: davem@davemloft.net
Cc: dhowells@redhat.com
Cc: james.hogan@imgtec.com
Cc: jejb@parisc-linux.org
Cc: linux@armlinux.org.uk
Cc: mpe@ellerman.id.au
Cc: ralf@linux-mips.org
Cc: realmz6@gmail.com
Cc: rkuo@codeaurora.org
Cc: rth@twiddle.net
Cc: schwidefsky@de.ibm.com
Cc: tony.luck@intel.com
Cc: vgupta@synopsys.com
Cc: ysato@users.sourceforge.jp
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
exponential backoff"
This reverts commit e78fdfef84be13a5c2b8276e12203cdf24778596.
The issue was fixed in hardware in HS2.1C release and there are no known
external users of affected RTL so revert the whole delayed retry series !
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
spin-wait cycle"
This reverts commit b89aa12c177477e34caa722818536fb5d0bffd76.
The issue was fixed in hardware in HS2.1C release and there are no known
external users of affected RTL so revert the whole delayed retry series !
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
backoff"
This reverts commit 10971638701dedadb58c88ce4d31c9375b224ed6.
The issue was fixed in hardware in HS2.1C release and there are no known
external users of affected RTL - so revert thw whole delayed retry
series !
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
Merge updates from Andrew Morton:
- fsnotify fix
- poll() timeout fix
- a few scripts/ tweaks
- debugobjects updates
- the (small) ocfs2 queue
- Minor fixes to kernel/padata.c
- Maybe half of the MM queue
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (117 commits)
mm, page_alloc: restore the original nodemask if the fast path allocation failed
mm, page_alloc: uninline the bad page part of check_new_page()
mm, page_alloc: don't duplicate code in free_pcp_prepare
mm, page_alloc: defer debugging checks of pages allocated from the PCP
mm, page_alloc: defer debugging checks of freed pages until a PCP drain
cpuset: use static key better and convert to new API
mm, page_alloc: inline pageblock lookup in page free fast paths
mm, page_alloc: remove unnecessary variable from free_pcppages_bulk
mm, page_alloc: pull out side effects from free_pages_check
mm, page_alloc: un-inline the bad part of free_pages_check
mm, page_alloc: check multiple page fields with a single branch
mm, page_alloc: remove field from alloc_context
mm, page_alloc: avoid looking up the first zone in a zonelist twice
mm, page_alloc: shortcut watermark checks for order-0 pages
mm, page_alloc: reduce cost of fair zone allocation policy retry
mm, page_alloc: shorten the page allocator fast path
mm, page_alloc: check once if a zone has isolated pageblocks
mm, page_alloc: move __GFP_HARDWALL modifications out of the fastpath
mm, page_alloc: simplify last cpupid reset
mm, page_alloc: remove unnecessary initialisation from __alloc_pages_nodemask()
...
|
|
I've just discovered that the useful-sounding has_transparent_hugepage()
is actually an architecture-dependent minefield: on some arches it only
builds if CONFIG_TRANSPARENT_HUGEPAGE=y, on others it's also there when
not, but on some of those (arm and arm64) it then gives the wrong
answer; and on mips alone it's marked __init, which would crash if
called later (but so far it has not been called later).
Straighten this out: make it available to all configs, with a sensible
default in asm-generic/pgtable.h, removing its definitions from those
arches (arc, arm, arm64, sparc, tile) which are served by the default,
adding #define has_transparent_hugepage has_transparent_hugepage to
those (mips, powerpc, s390, x86) which need to override the default at
runtime, and removing the __init from mips (but maybe that kind of code
should be avoided after init: set a static variable the first time it's
called).
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Yang Shi <yang.shi@linaro.org>
Cc: Ning Qu <quning@gmail.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Vineet Gupta <vgupta@synopsys.com> [arch/arc]
Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> [arch/s390]
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
The default 256 bytes sometimes is just not enough.
We usually provide earlycon=... and console=... and ip=...
All this and more may need more room.
Signed-off-by: Noam Camus <noamc@ezchip.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
|
|
Since the CTOP is SMT hardware multi-threaded, we need to hint
the HW that now will be a very good time to do a hardware
thread context switching. This is done by issuing the schd.rw
instruction (binary coded here so as to not require specific
revision of GCC to build the kernel).
sched.rw means that Thread becomes eligible for execution by
the threads scheduler after all pending read/write
transactions were completed.
Implementing cpu_relax_lowlatency() with barrier()
Since with current semantics of cpu_relax() it may take a
while till yielded CPU will get back.
Signed-off-by: Noam Camus <noamc@ezchip.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
|
|
With generic "identity" num of CPUs is limited to 256 (8 bit).
We use our alternative AUX register GLOBAL_ID (12 bit).
Now we can support up to 4096 CPUs.
Signed-off-by: Noam Camus <noamc@ezchip.com>
|
|
NPS device got 256 cores and each got 16 HW threads (SMT).
We use EZchip dedicated ISA to trigger HW scheduler of the
core that current HW thread belongs to.
This scheduling makes sure that data beyond barrier is available
to all HW threads in core and by that to all in device (4K).
Signed-off-by: Noam Camus <noamc@ezchip.com>
Cc: Peter Zijlstra <peterz@infradead.org>
|
|
We need our own implementaions since we lack LLSC support.
Our extended ISA provided with optimized solution for all 32bit
operations we see in these three headers.
Signed-off-by: Noam Camus <noamc@ezchip.com>
|
|
NPS use special mapping right below TASK_SIZE.
Hence we need to lower STACK_TOP so that user stack won't
overlap NPS special mapping.
Signed-off-by: Noam Camus <noamc@ezchip.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
|