summaryrefslogtreecommitdiffstats
path: root/arch/arm/kernel/setup.c
diff options
context:
space:
mode:
authorRob Herring <rob.herring@calxeda.com>2012-11-29 20:39:54 +0100
committerRussell King <rmk+kernel@arm.linux.org.uk>2012-12-03 11:16:36 +0000
commit14318efb322e2fe1a034c69463d725209eb9d548 (patch)
treecf99b8a06eca7abdb89ac87c7f35bebb6b3254f7 /arch/arm/kernel/setup.c
parent3e99675af1b25a191c467700499b1cbe5585a778 (diff)
downloadlinux-14318efb322e2fe1a034c69463d725209eb9d548.tar.bz2
ARM: 7587/1: implement optimized percpu variable access
Use the previously unused TPIDRPRW register to store percpu offsets. TPIDRPRW is only accessible in PL1, so it can only be used in the kernel. This replaces 2 loads with a mrc instruction for each percpu variable access. With hackbench, the performance improvement is 1.4% on Cortex-A9 (highbank). Taking an average of 30 runs of "hackbench -l 1000" yields: Before: 6.2191 After: 6.1348 Will Deacon reported similar delta on v6 with 11MPCore. The asm "memory clobber" are needed here to ensure the percpu offset gets reloaded. Testing by Will found that this would not happen in __schedule() which is a bit of a special case as preemption is disabled but the execution can move cores. Signed-off-by: Rob Herring <rob.herring@calxeda.com> Acked-by: Will Deacon <will.deacon@arm.com> Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Diffstat (limited to 'arch/arm/kernel/setup.c')
-rw-r--r--arch/arm/kernel/setup.c6
1 files changed, 6 insertions, 0 deletions
diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index f739fb1d217a..9a89bf4aefe1 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -383,6 +383,12 @@ void cpu_init(void)
BUG();
}
+ /*
+ * This only works on resume and secondary cores. For booting on the
+ * boot cpu, smp_prepare_boot_cpu is called after percpu area setup.
+ */
+ set_my_cpu_offset(per_cpu_offset(cpu));
+
cpu_proc_init();
/*