path: root/arch/arm/lib/io-readsb.S
author: Kirill A. Shutemov <kirill@shutemov.name> 2009-09-15 10:26:33 +0100
committer: Russell King <rmk+kernel@arm.linux.org.uk> 2009-09-15 22:07:02 +0100
commit: dca230f00d737353e2dffae489c916b41971921f
tree: 49490aab441deb87d7f9df5f0d737dad51d454fb /arch/arm/lib/io-readsb.S
parent: 910a17e57ab6cd22b300bde4ce5f633f175c7ccd
download: linux-dca230f00d737353e2dffae489c916b41971921f.tar.bz2
ARM: 5701/1: ARM: copy_page.S: take into account the size of the cache line
The optimized version of copy_page() was written with the assumption that
the cache line size is 32 bytes. On Cortex-A8 the cache line size is 64 bytes.

This patch tries to generalize copy_page() to work with any cache line
size, provided the cache line size is a multiple of 16 and the page size is
a multiple of twice the cache line size.

After this optimization we got a ~25% speedup on OMAP3 (tested in
userspace).

There is a test for kernelspace which triggers copy-on-write after fork():

    #include <stdlib.h>
    #include <string.h>
    #include <sys/wait.h>
    #include <unistd.h>

    #define BUF_SIZE (10000*4096)
    #define NFORK 200

    int main(int argc, char **argv)
    {
            char *buf = malloc(BUF_SIZE);
            int i;

            /* Touch every page in the parent so the child shares them. */
            memset(buf, 0, BUF_SIZE);

            for (i = 0; i < NFORK; i++) {
                    if (fork()) {
                            /* Parent: wait for the child, then fork again. */
                            wait(NULL);
                    } else {
                            int j;

                            /* Child: write one byte per page to force a COW copy. */
                            for (j = 0; j < BUF_SIZE; j += 4096)
                                    buf[j] = (j & 0xFF) + 1;
                            break;
                    }
            }

            free(buf);

            return 0;
    }

Before optimization this test takes ~66 seconds; after optimization it
takes ~56 seconds.

Signed-off-by: Siarhei Siamashka <siarhei.siamashka@nokia.com>
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
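
The two constraints in the commit message (cache line size a multiple of 16, page size a multiple of two cache lines) can be illustrated with a minimal C sketch. This is not the assembly from the patch. The names SKETCH_PAGE_SIZE, SKETCH_CACHE_LINE, SKETCH_COPY_COUNT and copy_page_sketch() are hypothetical, and the 16-byte inner step is an assumption made to mirror the "multiple of 16" requirement.

```c
#include <assert.h>
#include <string.h>

/* Assumed values for illustration only; the kernel uses its own constants. */
#define SKETCH_PAGE_SIZE  4096u
#define SKETCH_CACHE_LINE 64u	/* e.g. Cortex-A8; 32 on cores with 32-byte lines */

/* The two preconditions stated in the commit message. */
_Static_assert(SKETCH_CACHE_LINE % 16 == 0,
	       "cache line size must be a multiple of 16");
_Static_assert(SKETCH_PAGE_SIZE % (2 * SKETCH_CACHE_LINE) == 0,
	       "page size must be a multiple of two cache lines");

/* Outer iterations: each one copies two cache lines. */
#define SKETCH_COPY_COUNT (SKETCH_PAGE_SIZE / (2 * SKETCH_CACHE_LINE))

static void copy_page_sketch(void *dst, const void *src)
{
	unsigned char *d = dst;
	const unsigned char *s = src;
	unsigned int i, off;

	for (i = 0; i < SKETCH_COPY_COUNT; i++) {
		/* Copy two cache lines in 16-byte blocks (assumed block size). */
		for (off = 0; off < 2 * SKETCH_CACHE_LINE; off += 16)
			memcpy(d + off, s + off, 16);
		d += 2 * SKETCH_CACHE_LINE;
		s += 2 * SKETCH_CACHE_LINE;
	}
}
```

With a 4096-byte page and a 64-byte line the outer loop runs 32 times; with a 32-byte line it runs 64 times. Changing the line size only changes the loop counts, which is the sense in which the routine no longer hard-codes 32 bytes.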
Diffstat (limited to 'arch/arm/lib/io-readsb.S')
0 files changed, 0 insertions, 0 deletions