linux - Linux Kernel (branches are rebased on master from time to time)

Age	Commit message (Collapse)	Author	Files	Lines
2016-10-04	powerpc: tm: Enable transactional memory (TM) lazily for userspace	Cyril Bur	3	-5/+33
	Currently the MSR TM bit is always set if the hardware is TM capable. This adds extra overhead as it means the TM SPRS (TFHAR, TEXASR and TFAIR) must be swapped for each process regardless of if they use TM. For processes that don't use TM the TM MSR bit can be turned off allowing the kernel to avoid the expensive swap of the TM registers. A TM unavailable exception will occur if a thread does use TM and the kernel will enable MSR_TM and leave it so for some time afterwards. Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04	powerpc/tm: Add TM Unavailable Exception	Cyril Bur	1	-0/+29
	If the kernel disables transactional memory (TM) and userspace still tries TM related actions (TM instructions or TM SPR accesses) TM aware hardware will cause the kernel to take a facility unavailable exception. Add checks for the exception being caused by illegal TM access in userspace. Signed-off-by: Cyril Bur <cyrilbur@gmail.com> [mpe: Rewrite comment entirely, bugs in it are mine] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04	powerpc: Remove do_load_up_transact_{fpu,altivec}	Cyril Bur	3	-56/+0
	Previous rework of TM code leaves these functions unused Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04	powerpc: tm: Rename transct_(*) to ck(\1)_state	Cyril Bur	10	-94/+94
	Make the structures being used for checkpointed state named consistently with the pt_regs/ckpt_regs. Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04	powerpc: tm: Always use fp_state and vr_state to store live registers	Cyril Bur	7	-381/+197
	There is currently an inconsistency as to how the entire CPU register state is saved and restored when a thread uses transactional memory (TM). Using transactional memory results in the CPU having duplicated (almost) all of its register state. This duplication results in a set of registers which can be considered 'live', those being currently modified by the instructions being executed and another set that is frozen at a point in time. On context switch, both sets of state have to be saved and (later) restored. These two states are often called a variety of different things. Common terms for the state which only exists after the CPU has entered a transaction (performed a TBEGIN instruction) in hardware are 'transactional' or 'speculative'. Between a TBEGIN and a TEND or TABORT (or an event that causes the hardware to abort), regardless of the use of TSUSPEND the transactional state can be referred to as the live state. The second state is often to referred to as the 'checkpointed' state and is a duplication of the live state when the TBEGIN instruction is executed. This state is kept in the hardware and will be rolled back to on transaction failure. Currently all the registers stored in pt_regs are ALWAYS the live registers, that is, when a thread has transactional registers their values are stored in pt_regs and the checkpointed state is in ckpt_regs. A strange opposite is true for fp_state/vr_state. When a thread is non transactional fp_state/vr_state holds the live registers. When a thread has initiated a transaction fp_state/vr_state holds the checkpointed state and transact_fp/transact_vr become the structure which holds the live state (at this point it is a transactional state). This method creates confusion as to where the live state is, in some circumstances it requires extra work to determine where to put the live state and prevents the use of common functions designed (probably before TM) to save the live state. With this patch pt_regs, fp_state and vr_state all represent the same thing and the other structures [pending rename] are for checkpointed state. Acked-by: Simon Guo <wei.guo.simon@gmail.com> Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04	powerpc: signals: Stop using current in signal code	Cyril Bur	5	-126/+159
	Much of the signal code takes a pt_regs on which it operates. Over time the signal code has needed to know more about the thread than what pt_regs can supply, this information is obtained as needed by using 'current'. This approach is not strictly incorrect however it does mean that there is now a hard requirement that the pt_regs being passed around does belong to current, this is never checked. A safer approach is for the majority of the signal functions to take a task_struct from which they can obtain pt_regs and any other information they need. The caveat that the task_struct they are passed must be current doesn't go away but can more easily be checked for. Functions called from outside powerpc signal code are passed a pt_regs and they can confirm that the pt_regs is that of current and pass current to other functions, furthurmore, powerpc signal functions can check that the task_struct they are passed is the same as current avoiding possible corruption of current (or the task they are passed) if this assertion ever fails. CC: paulus@samba.org Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04	powerpc: Never giveup a reclaimed thread when enabling kernel {fp, altivec, vsx}	Cyril Bur	1	-3/+36
	After a thread is reclaimed from its active or suspended transactional state the checkpointed state exists on CPU, this state (along with the live/transactional state) has been saved in its entirety by the reclaiming process. There exists a sequence of events that would cause the kernel to call one of enable_kernel_fp(), enable_kernel_altivec() or enable_kernel_vsx() after a thread has been reclaimed. These functions save away any user state on the CPU so that the kernel can use the registers. Not only is this saving away unnecessary at this point, it is actually incorrect. It causes a save of the checkpointed state to the live structures within the thread struct thus destroying the true live state for that thread. Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04	powerpc: Return the new MSR from msr_check_and_set()	Cyril Bur	2	-2/+4
	msr_check_and_set() always performs a mfmsr() to determine if it needs to perform an mtmsr(), as mfmsr() can be a costly operation msr_check_and_set() could return the MSR now on the CPU to avoid callers of msr_check_and_set having to make their own mfmsr() call. Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04	powerpc: Add check_if_tm_restore_required() to giveup_all()	Cyril Bur	1	-0/+1
	giveup_all() causes FPU/VMX/VSX facilities to be disabled in a threads MSR. If the thread performing the giveup was transactional, the kernel must record which facilities were in use before the giveup as the thread must have these facilities re-enabled on return to userspace. >From process.c: /* * This is called if we are on the way out to userspace and the * TIF_RESTORE_TM flag is set. It checks if we need to reload * FP and/or vector state and does so if necessary. * If userspace is inside a transaction (whether active or * suspended) and FP/VMX/VSX instructions have ever been enabled * inside that transaction, then we have to keep them enabled * and keep the FP/VMX/VSX state loaded while ever the transaction * continues. The reason is that if we didn't, and subsequently * got a FP/VMX/VSX unavailable interrupt inside a transaction, * we don't know whether it's the same transaction, and thus we * don't know which of the checkpointed state and the transactional * state to use. */ Calling check_if_tm_restore_required() will set TIF_RESTORE_TM and save the MSR if needed. Fixes: c208505 ("powerpc: create giveup_all()") Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04	powerpc: Always restore FPU/VEC/VSX if hardware transactional memory in use	Cyril Bur	1	-3/+18
	Comment from arch/powerpc/kernel/process.c:967: If userspace is inside a transaction (whether active or suspended) and FP/VMX/VSX instructions have ever been enabled inside that transaction, then we have to keep them enabled and keep the FP/VMX/VSX state loaded while ever the transaction continues. The reason is that if we didn't, and subsequently got a FP/VMX/VSX unavailable interrupt inside a transaction, we don't know whether it's the same transaction, and thus we don't know which of the checkpointed state and the ransactional state to use. restore_math() restore_fp() and restore_altivec() currently may not restore the registers. It doesn't appear that this is more serious than a performance penalty. If the math registers aren't restored the userspace thread will still be run with the facility disabled. Userspace will not be able to read invalid values. On the first access it will take an facility unavailable exception and the kernel will detected an active transaction, at which point it will abort the transaction. There is the possibility for a pathological case preventing any progress by transactions, however, transactions are never guaranteed to make progress. Fixes: 70fe3d9 ("powerpc: Restore FPU/VEC/VSX if previously used") Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04	powerpc/powernv: Fix data type for @r in pnv_ioda_parse_m64_window()	Gavin Shan	1	-1/+1
	This fixes warning reported from sparse: pci-ioda.c:451:49: warning: incorrect type in argument 2 (different base types) Fixes: 262af557dd75 ("powerpc/powernv: Enable M64 aperatus for PHB3") Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04	powerpc/powernv: Use CPU-endian PEST in pnv_pci_dump_p7ioc_diag_data()	Gavin Shan	1	-2/+2
	This fixes the warnings reported from sparse: pci.c:312:33: warning: restricted __be64 degrades to integer pci.c:313:33: warning: restricted __be64 degrades to integer Fixes: cee72d5bb489 ("powerpc/powernv: Display diag data on p7ioc EEH errors") Cc: stable@vger.kernel.org # v3.3+ Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04	powerpc/powernv: Specify proper data type for PCI_SLOT_ID_PREFIX	Gavin Shan	1	-1/+1
	This fixes the warning reported from sparse: eeh-powernv.c:875:23: warning: constant 0x8000000000000000 is so big it is unsigned long Fixes: ebe225312739 ("powerpc/powernv: Support PCI slot ID") Suggested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04	powerpc/powernv: Use CPU-endian hub diag-data type in ↵	Gavin Shan	1	-1/+1
	pnv_eeh_get_and_dump_hub_diag() The hub diag-data type is filled with big-endian data by OPAL call opal_pci_get_hub_diag_data(). We need convert it to CPU-endian value before using it. The issue is reported by sparse as pointed by Michael Ellerman: eeh-powernv.c:1309:21: warning: restricted __be16 degrades to integer This converts hub diag-data type to CPU-endian before using it in pnv_eeh_get_and_dump_hub_diag(). Fixes: 2a485ad7c88d ("powerpc/powernv: Drop PHB operation next_error()") Cc: stable@vger.kernel.org # v4.1+ Suggested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04	powerpc/powernv: Pass CPU-endian PE number to opal_pci_eeh_freeze_clear()	Gavin Shan	1	-1/+1
	The PE number (@frozen_pe_no), filled by opal_pci_next_error() is in big-endian format. It should be converted to CPU-endian before it is passed to opal_pci_eeh_freeze_clear() when clearing the frozen state if the PE is invalid one. As Michael Ellerman pointed out, the issue is also detected by sparse: eeh-powernv.c:1541:41: warning: incorrect type in argument 2 (different base types) This passes CPU-endian PE number to opal_pci_eeh_freeze_clear() and it should be part of commit <0f36db77643b> ("powerpc/eeh: Fix wrong printed PE number"), which was merged to 4.3 kernel. Fixes: 71b540adffd9 ("powerpc/powernv: Don't escalate non-existing frozen PE") Cc: stable@vger.kernel.org # v4.3+ Suggested-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04	powerpc: Set default CPU type to POWER8 for little endian builds	Anton Blanchard	1	-0/+1
	We supported POWER7 CPUs for bootstrapping little endian, but the target was always POWER8. Now that POWER7 specific issues are impacting performance, change the default target to POWER8. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04	powerpc: Only disable HAVE_EFFICIENT_UNALIGNED_ACCESS on POWER7 little endian	Anton Blanchard	1	-1/+1
	POWER8 handles unaligned accesses in little endian mode, but commit 0b5e6661ac69 ("powerpc: Don't set HAVE_EFFICIENT_UNALIGNED_ACCESS on little endian builds") disabled it for all. The issue with unaligned little endian accesses is specific to POWER7, so update the Kconfig check to match. Using the stat() testcase from commit a75c380c7129 ("powerpc: Enable DCACHE_WORD_ACCESS on ppc64le"), performance improves 15% on POWER8. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04	powerpc: Remove static branch prediction in atomic{, 64}_add_unless	Anton Blanchard	1	-2/+2
	I see quite a lot of static branch mispredictions on a simple web serving workload. The issue is in __atomic_add_unless(), called from _atomic_dec_and_lock(). There is no obvious common case, so it is better to let the hardware predict the branch. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04	powerpc: During context switch, check before setting mm_cpumask	Anton Blanchard	1	-1/+2
	During context switch, switch_mm() sets our current CPU in mm_cpumask. We can avoid this atomic sequence in most cases by checking before setting the bit. Testing on a POWER8 using our context switch microbenchmark: tools/testing/selftests/powerpc/benchmarks/context_switch \ --process --no-fp --no-altivec --no-vector Performance improves 2%. Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04	powerpc/eeh: Quieten EEH message when no adapters are found	Anton Blanchard	1	-1/+1
	No real need for this to be pr_warn(), reduce it to pr_info(). Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04	powerpc/configs: Enable Intel i40e on 64 bit configs	Anton Blanchard	3	-0/+3
	We are starting to see i40e adapters in recent machines, so enable it in our configs. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04	powerpc/configs: Change a few things from built in to modules	Anton Blanchard	3	-21/+21
	Change a few devices and filesystems that are seldom used any more from built in to modules. This reduces our vmlinux about 500kB. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04	powerpc/configs: Bump kernel ring buffer size on 64 bit configs	Anton Blanchard	3	-0/+6
	When we issue a system reset, every CPU in the box prints an Oops, including a backtrace. Each of these can be quite large (over 4kB) and we may end up wrapping the ring buffer and losing important information. Bump the base size from 128kB to 256kB and the per CPU size from 4kB to 8kB. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04	powerpc/configs: Enable VMX crypto	Anton Blanchard	3	-0/+6
	We see big improvements with the VMX crypto functions (often 10x or more), so enable it as a module. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04	powerpc/64: Align hot loops of memset() and backwards_memcpy()	Anton Blanchard	1	-0/+2
	Align the hot loops in our assembly implementation of memset() and backwards_memcpy(). backwards_memcpy() is called from tcp_v4_rcv(), so we might want to optimise this a little more. Signed-off-by: Anton Blanchard <anton@samba.org> Reviewed-by: Nick Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04	powerpc/64s: Remove unused exception code, small cleanups	Nicholas Piggin	1	-10/+20
	This was not done before the big patches because I only noticed them afterwards. It has become much easier to see which handlers are branched to from which exception vectors now, and to see exactly what vector space is being used for what. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04	powerpc/64s: Use a single macro for both parts of OOL exception	Nicholas Piggin	1	-37/+18
	Simple substitution. This is possible now that both parts of the OOL initial handler get linked into their correct location. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04	powerpc/64s: Move __replay_interrupt function below handlers	Nicholas Piggin	1	-38/+36
	This is not an exception handler as such, it's called from local_irq_enable(), not exception entry. Also clean up some now redundant comments at the end of the consolidation series. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04	powerpc/64s: Consolidate CBE Thermal 0x1800 interrupt	Nicholas Piggin	1	-13/+2
	Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04	powerpc/64s: Consolidate Altivec 0x1700 interrupt	Nicholas Piggin	1	-9/+7
	Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04	powerpc/64s: Consolidate Debug 0x1600 interrupt	Nicholas Piggin	1	-5/+3
	Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04	powerpc/64s: Consolidate Softpatch 0x1500 interrupt	Nicholas Piggin	1	-35/+37
	Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04	powerpc/64s: Consolidate Instruction Breakpoint 0x1300 interrupt	Nicholas Piggin	1	-4/+3
	Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04	powerpc/64s: Consolidate CBE System Error 0x1200 interrupt	Nicholas Piggin	1	-5/+3
	Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04	powerpc/64s: Consolidate Reserved 0xfa0-0x1200 interrupts	Nicholas Piggin	1	-2/+1
	Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04	powerpc/64s: Consolidate Hypervisor Facility Unavailable 0xf80 interrupt	Nicholas Piggin	1	-10/+6
	Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04	powerpc/64s: Consolidate Facility Unavailable 0xf60 interrupt	Nicholas Piggin	1	-7/+6
	Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04	powerpc/64s: Consolidate VSX Unavailable 0xf40 interrupt	Nicholas Piggin	1	-38/+36
	Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04	powerpc/64s: Consolidate Vector Unavailable 0xf20 interrupt	Nicholas Piggin	1	-39/+37
	Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04	powerpc/64s: Consolidate Performance Monitor 0xf00 interrupt	Nicholas Piggin	1	-7/+6
	Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04	powerpc/64s: Consolidate Reserved 0xec0, 0xee0 interrupts	Nicholas Piggin	1	-2/+2
	Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04	powerpc/64s: Consolidate Hypervisor Virtualization 0xea0 interrupt	Nicholas Piggin	1	-8/+6
	Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04	powerpc/64s: Consolidate Directed Hypervisor Doorbell 0xe80 interrupt	Nicholas Piggin	1	-11/+10
	Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04	powerpc/64s: Consolidate Hypervisor Maintenance 0xe60 interrupt	Nicholas Piggin	1	-57/+56
	Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04	powerpc/64s: Consolidate Hypervisor Emulation Assistance 0xe40 interrupt	Nicholas Piggin	1	-7/+6
	Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04	powerpc/64s: Consolidate Hypervisor Instruction Storage 0xe20 interrupt	Nicholas Piggin	1	-9/+7
	Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04	powerpc/64s: Consolidate Hypervisor Data Storage 0xe00 interrupt	Nicholas Piggin	1	-24/+18
	Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04	powerpc/64s: Consolidate Trace 0xd00 interrupt	Nicholas Piggin	1	-4/+2
	Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04	powerpc/64s: Consolidate System Call 0xc00 interrupt	Nicholas Piggin	1	-60/+61
	Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04	powerpc/64s: Consolidate Reserved 0xb00 interrupt	Nicholas Piggin	1	-4/+3
	Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>