summaryrefslogtreecommitdiffstats
path: root/arch
AgeCommit message (Collapse)AuthorFilesLines
2020-12-04powerpc/vdso: Retrieve sigtramp offsets at buildtimeChristophe Leroy12-12/+98
This is copied from arm64. Instead of using runtime generated signal trampoline offsets, get offsets at buildtime. If the said trampoline doesn't exist, build will fail. So no need to check whether the trampoline exists or not in the VDSO. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/f8bfd6812c3e3678b1cdb4d55a52f9eb022b40d3.1601197618.git.christophe.leroy@csgroup.eu
2020-12-04powerpc/vdso: Remove unused \tmp param in __get_datapage()Christophe Leroy6-9/+9
The \tmp param is not used anymore, remove it. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/4b13f897dcccce8ae03c031a4598cf26b32e2f1c.1601197618.git.christophe.leroy@csgroup.eu
2020-12-04powerpc/vdso: Simplify __get_datapage()Christophe Leroy3-3/+9
The VDSO datapage and the text pages are always located immediately next to each other, so it can be hardcoded without an indirection through __kernel_datapage_offset Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/b08f5ef99d64cfc38f79b7ad5310d9b4d2479eeb.1601197618.git.christophe.leroy@csgroup.eu
2020-12-04powerpc/vdso: Move vdso datapage up frontChristophe Leroy2-8/+8
Move the vdso datapage in front of the VDSO area, before vdso test. This will allow to remove the __kernel_datapage_offset symbol and simplify __get_datapage() in following patches. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/b68c99b6e8ee0b1d99bfa4c7e34c359fc1bc1000.1601197618.git.christophe.leroy@csgroup.eu
2020-12-04powerpc/vdso: Replace vdso_base by vdsoChristophe Leroy13-25/+27
All other architectures but s390 use a void pointer named 'vdso' to reference the VDSO mapping. In a following patch, the VDSO data page will be put in front of text, vdso_base will then not anymore point to VDSO text. To avoid confusion between vdso_base and VDSO text, rename vdso_base into vdso and make it a void __user *. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/8e6cefe474aa4ceba028abb729485cd46c140990.1601197618.git.christophe.leroy@csgroup.eu
2020-12-04powerpc/vdso: Provide vdso_remap()Christophe Leroy2-25/+24
Provide vdso_remap() through _install_special_mapping() and drop arch_remap(). This adds a test of the size and returns -EINVAL if the size is not correct. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/373c66f768fa9cc8890f3b55462209a98c522326.1601197618.git.christophe.leroy@csgroup.eu
2020-12-04powerpc/vdso: Move to _install_special_mapping() and remove arch_vma_name()Christophe Leroy1-23/+19
Copied from commit 2fea7f6c98f5 ("arm64: vdso: move to _install_special_mapping and remove arch_vma_name"). Use the new _install_special_mapping() API added by commit a62c34bd2a8a ("x86, mm: Improve _install_special_mapping and fix x86 vdso naming") which obsolete install_special_mapping(). And remove arch_vma_name() as the name is handled by the new API. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: kernel test robot <lkp@intel.com> [mpe: Squash fix to use PTR_ERR_OR_ZERO() from lkp] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/e7e5dfe0f93234e31051f2a610b4b07f50b0082f.1601197618.git.christophe.leroy@csgroup.eu
2020-12-04powerpc/vdso: Simplify arch_setup_additional_pages() exitChristophe Leroy1-19/+21
To simplify arch_setup_additional_pages() exit, rename it __arch_setup_additional_pages() and create a caller arch_setup_additional_pages() which does the locking. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/603c1d039d3f928ee95e547fcd2219fcf4c3b514.1601197618.git.christophe.leroy@csgroup.eu
2020-12-04powerpc/vdso: Use VDSO size in arch_setup_additional_pages()Christophe Leroy1-12/+6
In arch_setup_additional_pages(), instead of using number of VDSO pages and recalculate VDSO size, directly use the VDSO size. As vdso_ready is set, vdso_pages can't be 0 so just remove the test. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/4edfa548c3885a430b765335dc720105716e273f.1601197618.git.christophe.leroy@csgroup.eu
2020-12-04powerpc/vdso: Remove unnecessary ifdefs in vdso_pagelist initializationChristophe Leroy1-25/+6
No need of all those #ifdefs around the pagelist initialisation, use IS_ENABLED(), GCC will kick out unused static variables. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/f9333432e329b1fcbbbf846cb1cd4a1c4127a60b.1601197618.git.christophe.leroy@csgroup.eu
2020-12-04powerpc/vdso: Refactor 32 bits and 64 bits pages setupChristophe Leroy1-20/+19
The setup of VDSO pages is identical for 32 bits VDSO and 64 bits VDSO. Refactor that setup. And use &vdsoXX_start which is synonym of vdsoXX_kbase. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/269ffb54c37fc1d46128f77d7a39f88ef4a9957d.1601197618.git.christophe.leroy@csgroup.eu
2020-12-04powerpc/vdso: Remove NULL termination element in vdso_pagelistChristophe Leroy1-4/+2
No need of a NULL last element in pagelists, install_special_mapping() knows how long the list is. Remove that element. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/e58d95ab859e3cbc9bae3c9ce2959e17d2864f5d.1601197618.git.christophe.leroy@csgroup.eu
2020-12-04powerpc/vdso: Remove get_page() in vdso_pagelist initializationChristophe Leroy1-4/+2
Partly copied from commit 16fb1a9bec61 ("arm64: vdso: clean up vdso_pagelist initialization"). No need to get_page() the vdso text/data - these are part of the kernel image. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/9d14540bd10832b6c9519d74fb5728fdc4974b36.1601197618.git.christophe.leroy@csgroup.eu
2020-12-04powerpc/vdso: Rename syscall_map_32/64 to simplify vdso_setup_syscall_map()Christophe Leroy3-15/+10
Today vdso_data structure has: - syscall_map_32[] and syscall_map_64[] on PPC64 - syscall_map_32[] on PPC32 On PPC32, syscall_map_32[] is populated using sys_call_table[]. On PPC64, syscall_map_64[] is populated using sys_call_table[] and syscal_map_32[] is populated using compat_sys_call_table[]. To simplify vdso_setup_syscall_map(), - On PPC32 rename syscall_map_32[] into syscall_map[], - On PPC64 rename syscall_map_64[] into syscall_map[], - On PPC64 rename syscall_map_32[] into compat_syscall_map[]. That way, syscall_map[] gets populated using sys_call_table[] and compat_syscall_map[] gets population using compat_sys_call_table[]. Also define an empty compat_syscall_map[] on PPC32 to avoid ifdefs. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/472734be0d9991eee320a06824219a5b2663736b.1601197618.git.christophe.leroy@csgroup.eu
2020-12-04powerpc/vdso: Add missing includes and clean vdso_setup_syscall_map()Christophe Leroy1-12/+5
Instead of including extern references locally in vdso_setup_syscall_map(), add the missing headers. sys_ni_syscall() being a function, cast its address to an unsigned long instead of declaring it as a fake unsigned long object. At the same time, remove a comment which paraphrases the function name. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/b4afedce748ed2858299ceab5ae29b52109263ef.1601197618.git.christophe.leroy@csgroup.eu
2020-12-04powerpc/vdso: Stripped VDSO is not needed, don't build itChristophe Leroy3-43/+4
Since commit 24b659a13866 ("powerpc: Use unstripped VDSO image for more accurate profiling data"), only the unstripped VDSO image has been used. Partially revert commit 8150caad0226 ("[POWERPC] powerpc vDSO: install unstripped copies on disk") to avoid building the stripped version. And the unstripped version in $(MODLIB)/vdso/ is not required anymore as it is the one embedded in the kernel image. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/5986ca25be44fe6e9790486304507f240077d8c4.1601197618.git.christophe.leroy@csgroup.eu
2020-12-04powerpc/signal32: Transform save_user_regs() and save_tm_user_regs() in ↵Christophe Leroy1-113/+111
'unsafe' version Change those two functions to be used within a user access block. For that, change save_general_regs() to and unsafe_save_general_regs(), then replace all user accesses by unsafe_ versions. This series leads to a reduction from 2.55s to 1.73s of the system CPU time with the following microbench app on an mpc832x with KUAP (approx 32%) Without KUAP, the difference is in the noise. void sigusr1(int sig) { } int main(int argc, char **argv) { int i = 100000; signal(SIGUSR1, sigusr1); for (;i--;) raise(SIGUSR1); exit(0); } An additional 0.10s reduction is achieved by removing CONFIG_PPC_FPU, as the mpc832x has no FPU. A bit less spectacular on an 8xx as KUAP is less heavy, prior to the series (with KUAP) it ran in 8.10 ms. Once applies the removal of FPU regs handling, we get 7.05s. With the full series, we get 6.9s. If artificially re-activating FPU regs handling with the full series, we get 7.6s. So for the 8xx, the removal of the FPU regs copy is what makes the difference, but the rework of handle_signal also have a benefit. Same as above, without KUAP the difference is in the noise. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> [mpe: Fixup typo in SPE handling] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/c7b37b385ccf9666066452e58f018a86573f83e8.1597770847.git.christophe.leroy@csgroup.eu
2020-12-04powerpc/signal32: Isolate non-copy actions in save_user_regs() and ↵Christophe Leroy1-13/+41
save_tm_user_regs() Reorder actions in save_user_regs() and save_tm_user_regs() to regroup copies together in order to switch to user_access_begin() logic in a later patch. Move non-copy actions into new functions called prepare_save_user_regs() and prepare_save_tm_user_regs(). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/f6eac65781b4a57220477c8864bca2b57f29a5d5.1597770847.git.christophe.leroy@csgroup.eu
2020-12-04powerpc/signal: Create 'unsafe' versions of copy_[ck][fpr/vsx]_to_user()Christophe Leroy1-0/+53
For the non VSX version, that's trivial. Just use unsafe_copy_to_user() instead of __copy_to_user(). For the VSX version, remove the intermediate step through a buffer and use unsafe_put_user() directly. This generates a far smaller code which is acceptable to inline, see below: Standard VSX version: 0000000000000000 <.copy_fpr_to_user>: 0: 7c 08 02 a6 mflr r0 4: fb e1 ff f8 std r31,-8(r1) 8: 39 00 00 20 li r8,32 c: 39 24 0b 80 addi r9,r4,2944 10: 7d 09 03 a6 mtctr r8 14: f8 01 00 10 std r0,16(r1) 18: f8 21 fe 71 stdu r1,-400(r1) 1c: 39 41 00 68 addi r10,r1,104 20: e9 09 00 00 ld r8,0(r9) 24: 39 4a 00 08 addi r10,r10,8 28: 39 29 00 10 addi r9,r9,16 2c: f9 0a 00 00 std r8,0(r10) 30: 42 00 ff f0 bdnz 20 <.copy_fpr_to_user+0x20> 34: e9 24 0d 80 ld r9,3456(r4) 38: 3d 42 00 00 addis r10,r2,0 3a: R_PPC64_TOC16_HA .toc 3c: eb ea 00 00 ld r31,0(r10) 3e: R_PPC64_TOC16_LO_DS .toc 40: f9 21 01 70 std r9,368(r1) 44: e9 3f 00 00 ld r9,0(r31) 48: 81 29 00 20 lwz r9,32(r9) 4c: 2f 89 00 00 cmpwi cr7,r9,0 50: 40 9c 00 18 bge cr7,68 <.copy_fpr_to_user+0x68> 54: 4c 00 01 2c isync 58: 3d 20 40 00 lis r9,16384 5c: 79 29 07 c6 rldicr r9,r9,32,31 60: 7d 3d 03 a6 mtspr 29,r9 64: 4c 00 01 2c isync 68: 38 a0 01 08 li r5,264 6c: 38 81 00 70 addi r4,r1,112 70: 48 00 00 01 bl 70 <.copy_fpr_to_user+0x70> 70: R_PPC64_REL24 .__copy_tofrom_user 74: 60 00 00 00 nop 78: e9 3f 00 00 ld r9,0(r31) 7c: 81 29 00 20 lwz r9,32(r9) 80: 2f 89 00 00 cmpwi cr7,r9,0 84: 40 9c 00 18 bge cr7,9c <.copy_fpr_to_user+0x9c> 88: 4c 00 01 2c isync 8c: 39 20 ff ff li r9,-1 90: 79 29 00 44 rldicr r9,r9,0,1 94: 7d 3d 03 a6 mtspr 29,r9 98: 4c 00 01 2c isync 9c: 38 21 01 90 addi r1,r1,400 a0: e8 01 00 10 ld r0,16(r1) a4: eb e1 ff f8 ld r31,-8(r1) a8: 7c 08 03 a6 mtlr r0 ac: 4e 80 00 20 blr 'unsafe' simulated VSX version (The ... are only nops) using unsafe_copy_fpr_to_user() macro: unsigned long copy_fpr_to_user(void __user *to, struct task_struct *task) { unsafe_copy_fpr_to_user(to, task, failed); return 0; failed: return 1; } 0000000000000000 <.copy_fpr_to_user>: 0: 39 00 00 20 li r8,32 4: 39 44 0b 80 addi r10,r4,2944 8: 7d 09 03 a6 mtctr r8 c: 7c 69 1b 78 mr r9,r3 ... 20: e9 0a 00 00 ld r8,0(r10) 24: f9 09 00 00 std r8,0(r9) 28: 39 4a 00 10 addi r10,r10,16 2c: 39 29 00 08 addi r9,r9,8 30: 42 00 ff f0 bdnz 20 <.copy_fpr_to_user+0x20> 34: e9 24 0d 80 ld r9,3456(r4) 38: f9 23 01 00 std r9,256(r3) 3c: 38 60 00 00 li r3,0 40: 4e 80 00 20 blr ... 50: 38 60 00 01 li r3,1 54: 4e 80 00 20 blr Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/29f6c4b8e7a5bbc61e6a8801b78bbf493f9f819e.1597770847.git.christophe.leroy@csgroup.eu
2020-12-04powerpc/signal32: Switch swap_context() to user_access_begin() logicChristophe Leroy1-14/+10
As this was the last user of put_sigset_t(), remove it as well. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/c3ac4f2d134a3391bb51bdaa2d00e9a409aba9f8.1597770847.git.christophe.leroy@csgroup.eu
2020-12-04powerpc/signal32: Add and use unsafe_put_sigset_t()Christophe Leroy1-2/+11
put_sigset_t() calls copy_to_user() for copying two words. This is terribly inefficient for copying two words. By switching to unsafe_put_user(), we end up with something as simple as: 3cc: 81 3d 00 00 lwz r9,0(r29) 3d0: 91 26 00 b4 stw r9,180(r6) 3d4: 81 3d 00 04 lwz r9,4(r29) 3d8: 91 26 00 b8 stw r9,184(r6) Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/06def97e87ac1c4ae8e3197e0982e1fab7b3c8ae.1597770847.git.christophe.leroy@csgroup.eu
2020-12-04powerpc/signal32: Remove ifdefery in middle of if/elseChristophe Leroy1-14/+8
MSR_TM_ACTIVE() is always defined and returns always 0 when CONFIG_PPC_TRANSACTIONAL_MEM is not selected, so the awful ifdefery in the middle of an if/else can be removed. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/f3c36d687e4228f58d5c207a4036aa9ddcc7420a.1597770847.git.christophe.leroy@csgroup.eu
2020-12-04powerpc/signal32: Switch handle_rt_signal32() to user_access_begin() logicChristophe Leroy1-21/+34
On the same way as handle_signal32(), replace all user accesses with equivalent unsafe_ versions, and move the trampoline code icache flush outside the user access block. Functions that have no unsafe_ equivalent also remains outside the access block. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/2974314226256f958e2984912b48883ef1754185.1597770847.git.christophe.leroy@csgroup.eu
2020-12-04powerpc/signal32: Switch handle_signal32() to user_access_begin() logicChristophe Leroy1-13/+16
Replace the access_ok() by user_access_begin() and change all user accesses to unsafe_ version. Move flush_icache_range() outside the user access block. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/a27797f781aa00da96f8284c898173d18e952361.1597770847.git.christophe.leroy@csgroup.eu
2020-12-04powerpc/signal32: Move signal trampoline setup to handle_[rt_]signal32Christophe Leroy1-39/+22
Move signal trampoline setup into handle_signal32() and handle_rt_signal32(). At the same time, remove the define which hides the mc_pad field used for trampoline. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/e439cc0fa35aa45da6776520777a61848b92fd4b.1597770847.git.christophe.leroy@csgroup.eu
2020-12-04powerpc/signal32: Misc changes to make handle_[rt_]_signal32() more similarChristophe Leroy1-10/+14
Miscellaneous changes to clean and make handle_signal32() and handle_rt_signal32() even more similar. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/df0bc8c3b8fa96390c46f611df79b2a94ac21844.1597770847.git.christophe.leroy@csgroup.eu
2020-12-04powerpc/signal32: Rename local pointers in handle_rt_signal32()Christophe Leroy1-26/+25
Rename pointers in handle_rt_signal32() to make it more similar to handle_signal32() tm_frame becomes tm_mctx frame becomes mctx rt_sf becomes frame Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/be77477b0f05397876015b218e36548ee8f5e10b.1597770847.git.christophe.leroy@csgroup.eu
2020-12-04powerpc/signal32: Move handle_signal32() close to handle_rt_signal32()Christophe Leroy1-85/+85
Those two functions are similar and serving the same purpose. To ease refactorisation, move them close to each other. This is pure move, no code change, no cosmetic. Yes, checkpatch is not happy, most will clear later. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/dbce67900bf566bcf40179467bf1eb500814c405.1597770847.git.christophe.leroy@csgroup.eu
2020-12-04powerpc/signal32: Simplify logging in handle_rt_signal32()Christophe Leroy1-5/+1
If something is bad in the frame, there is no point in knowing which part of the frame exactly is wrong as it got allocated as a single block. Always print the root address of the frame in case of failed user access, just like handle_signal32(). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/691895bd31fee89a2d8370befd66ad4eff5b63f2.1597770847.git.christophe.leroy@csgroup.eu
2020-12-04powerpc/signal: Refactor bad frame loggingChristophe Leroy4-43/+21
The logging of bad frame appears half a dozen of times and is pretty similar. Create signal_fault() fonction to perform that logging. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/fa094445c119fc00315e1c13783b493346306c6a.1597770847.git.christophe.leroy@csgroup.eu
2020-12-04powerpc/signal: Call get_tm_stackpointer() from get_sigframe()Christophe Leroy4-10/+11
Instead of calling get_tm_stackpointer() from the caller, call it directly from get_sigframe(). This avoids a double call and allows get_tm_stackpointer() to become static and be inlined into get_sigframe() by GCC. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/abfdc105b8b28c4eb3ab9a26297d17f302b600ea.1597770847.git.christophe.leroy@csgroup.eu
2020-12-04powerpc/signal: Remove get_clean_sp()Christophe Leroy2-15/+4
get_clean_sp() is only used once in kernel/signal.c . GCC is smart enough to see that x & 0xffffffff is a nop calculation on PPC32, no need of a special PPC32 trivial version. Include the logic from the PPC64 version of get_clean_sp() directly in get_sigframe(). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/13ef6510ce30a4867e043157b93af5bb8c67fb3b.1597770847.git.christophe.leroy@csgroup.eu
2020-12-04powerpc/signal: Move access_ok() out of get_sigframe()Christophe Leroy3-7/+3
This access_ok() will soon be performed by user_access_begin(). So move it out of get_sigframe(). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/900b93744732ed0887f28f5b6a40730fb04a43fa.1597770847.git.christophe.leroy@csgroup.eu
2020-12-04powerpc/signal: Remove BUG_ON() in handler_signal functionsChristophe Leroy2-6/+0
There is already the same BUG_ON() check in do_signal() which is the only caller of handle_rt_signal64() handle_rt_signal32() and handle_signal32(). Remove those three redundant BUG_ON(). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/3582e10a341d523c9c3f1ac925c3aaefc9d9293d.1597770847.git.christophe.leroy@csgroup.eu
2020-12-04powerpc/32s: Allow deselecting CONFIG_PPC_FPU on mpc832xChristophe Leroy2-2/+13
The e300c2 core which is embedded in mpc832x CPU doesn't have an FPU. Make it possible to not select CONFIG_PPC_FPU when building a kernel dedicated to that target. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/fcdc60d85baf80eaa0a7f3261d9d889282068216.1597770847.git.christophe.leroy@csgroup.eu
2020-12-04powerpc/signal: Don't manage floating point regs when no FPUChristophe Leroy11-3/+50
There is no point in copying floating point regs when there is no FPU and MATH_EMULATION is not selected. Create a new CONFIG_PPC_FPU_REGS bool that is selected by CONFIG_MATH_EMULATION and CONFIG_PPC_FPU, and use it to opt out everything related to fp_state in thread_struct. The asm const used only by fpu.S are opted out with CONFIG_PPC_FPU as fpu.S build is conditionnal to CONFIG_PPC_FPU. The following app spends approx 8.1 seconds system time on an 8xx without the patch, and 7.0 seconds with the patch (13.5% reduction). On an 832x, it spends approx 2.6 seconds system time without the patch and 2.1 seconds with the patch (19% reduction). void sigusr1(int sig) { } int main(int argc, char **argv) { int i = 100000; signal(SIGUSR1, sigusr1); for (;i--;) raise(SIGUSR1); exit(0); } Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/7569070083e6cd5b279bb5023da601aba3c06f3c.1597770847.git.christophe.leroy@csgroup.eu
2020-12-04powerpc/ptrace: Create ptrace_get_fpr() and ptrace_put_fpr()Christophe Leroy4-29/+56
On the same model as ptrace_get_reg() and ptrace_put_reg(), create ptrace_get_fpr() and ptrace_put_fpr() to get/set the floating points registers. We move the boundary checkings in them. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/24a1baedea7f7ae7b6bf27be98bab6d01b5ca2c1.1597770847.git.christophe.leroy@csgroup.eu
2020-12-04powerpc/ptrace: Consolidate reg index calculationChristophe Leroy1-14/+4
Today we have: #ifdef CONFIG_PPC32 index = addr >> 2; if ((addr & 3) || child->thread.regs == NULL) #else index = addr >> 3; if ((addr & 7)) #endif sizeof(long) has value 4 for PPC32 and value 8 for PPC64. Dividing by 4 is equivalent to >> 2 and dividing by 8 is equivalent to >> 3. And 3 and 7 are respectively (sizeof(long) - 1). Use sizeof(long) to get rid of the #ifdef CONFIG_PPC32 and consolidate the calculation and checking. thread.regs have to be not NULL on both PPC32 and PPC64 so adding that test on PPC64 is harmless. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/3cd1e284e93c60db981659585e18d1f6bb73ed2f.1597770847.git.christophe.leroy@csgroup.eu
2020-12-04powerpc/ptrace: Move declaration of ptrace_get_reg() and ptrace_set_reg()Christophe Leroy3-6/+5
ptrace_get_reg() and ptrace_set_reg() are only used internally by ptrace. Move them in arch/powerpc/kernel/ptrace/ptrace-decl.h Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/376c258267aeae54a4423bc4a2e107a9611f0039.1597770847.git.christophe.leroy@csgroup.eu
2020-12-04powerpc/signal: Move inline functions in signal.hChristophe Leroy2-38/+33
To really be inlined, the functions need to be defined in the same C file as the caller, or in an included header. Move functions defined inline from signal .c in signal.h Fixes: 3dd4eb83a9c0 ("powerpc: move common register copy functions from signal_32.c to signal.c") Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/35b1bd44a1a66f5bcf9b457a1c480ac8d5ef50b2.1597770847.git.christophe.leroy@csgroup.eu
2020-12-04powerpc/vdso: Provide __kernel_clock_gettime64() on vdso32Christophe Leroy4-0/+18
Provides __kernel_clock_gettime64() on vdso32. This is the 64 bits version of __kernel_clock_gettime() which is y2038 compliant. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201126131006.2431205-9-mpe@ellerman.id.au
2020-12-04powerpc/vdso: Switch VDSO to generic C implementation.Christophe Leroy12-691/+106
With the C VDSO, the performance is slightly lower, but it is worth it as it will ease maintenance and evolution, and also brings clocks that are not supported with the ASM VDSO. On an 8xx at 132 MHz, vdsotest with the ASM VDSO: gettimeofday: vdso: 828 nsec/call clock-getres-realtime-coarse: vdso: 391 nsec/call clock-gettime-realtime-coarse: vdso: 614 nsec/call clock-getres-realtime: vdso: 460 nsec/call clock-gettime-realtime: vdso: 876 nsec/call clock-getres-monotonic-coarse: vdso: 399 nsec/call clock-gettime-monotonic-coarse: vdso: 691 nsec/call clock-getres-monotonic: vdso: 460 nsec/call clock-gettime-monotonic: vdso: 1026 nsec/call On an 8xx at 132 MHz, vdsotest with the C VDSO: gettimeofday: vdso: 955 nsec/call clock-getres-realtime-coarse: vdso: 545 nsec/call clock-gettime-realtime-coarse: vdso: 592 nsec/call clock-getres-realtime: vdso: 545 nsec/call clock-gettime-realtime: vdso: 941 nsec/call clock-getres-monotonic-coarse: vdso: 545 nsec/call clock-gettime-monotonic-coarse: vdso: 591 nsec/call clock-getres-monotonic: vdso: 545 nsec/call clock-gettime-monotonic: vdso: 940 nsec/call It is even better for gettime with monotonic clocks. Unsupported clocks with ASM VDSO: clock-gettime-boottime: vdso: 3851 nsec/call clock-gettime-tai: vdso: 3852 nsec/call clock-gettime-monotonic-raw: vdso: 3396 nsec/call Same clocks with C VDSO: clock-gettime-tai: vdso: 941 nsec/call clock-gettime-monotonic-raw: vdso: 1001 nsec/call clock-gettime-monotonic-coarse: vdso: 591 nsec/call On an 8321E at 333 MHz, vdsotest with the ASM VDSO: gettimeofday: vdso: 220 nsec/call clock-getres-realtime-coarse: vdso: 102 nsec/call clock-gettime-realtime-coarse: vdso: 178 nsec/call clock-getres-realtime: vdso: 129 nsec/call clock-gettime-realtime: vdso: 235 nsec/call clock-getres-monotonic-coarse: vdso: 105 nsec/call clock-gettime-monotonic-coarse: vdso: 208 nsec/call clock-getres-monotonic: vdso: 129 nsec/call clock-gettime-monotonic: vdso: 274 nsec/call On an 8321E at 333 MHz, vdsotest with the C VDSO: gettimeofday: vdso: 272 nsec/call clock-getres-realtime-coarse: vdso: 160 nsec/call clock-gettime-realtime-coarse: vdso: 184 nsec/call clock-getres-realtime: vdso: 166 nsec/call clock-gettime-realtime: vdso: 281 nsec/call clock-getres-monotonic-coarse: vdso: 160 nsec/call clock-gettime-monotonic-coarse: vdso: 184 nsec/call clock-getres-monotonic: vdso: 169 nsec/call clock-gettime-monotonic: vdso: 275 nsec/call On a Power9 Nimbus DD2.2 at 3.8GHz, with the ASM VDSO: clock-gettime-monotonic: vdso: 35 nsec/call clock-getres-monotonic: vdso: 16 nsec/call clock-gettime-monotonic-coarse: vdso: 18 nsec/call clock-getres-monotonic-coarse: vdso: 522 nsec/call clock-gettime-monotonic-raw: vdso: 598 nsec/call clock-getres-monotonic-raw: vdso: 520 nsec/call clock-gettime-realtime: vdso: 34 nsec/call clock-getres-realtime: vdso: 16 nsec/call clock-gettime-realtime-coarse: vdso: 18 nsec/call clock-getres-realtime-coarse: vdso: 517 nsec/call getcpu: vdso: 8 nsec/call gettimeofday: vdso: 25 nsec/call And with the C VDSO: clock-gettime-monotonic: vdso: 37 nsec/call clock-getres-monotonic: vdso: 20 nsec/call clock-gettime-monotonic-coarse: vdso: 21 nsec/call clock-getres-monotonic-coarse: vdso: 19 nsec/call clock-gettime-monotonic-raw: vdso: 38 nsec/call clock-getres-monotonic-raw: vdso: 20 nsec/call clock-gettime-realtime: vdso: 37 nsec/call clock-getres-realtime: vdso: 20 nsec/call clock-gettime-realtime-coarse: vdso: 20 nsec/call clock-getres-realtime-coarse: vdso: 19 nsec/call getcpu: vdso: 8 nsec/call gettimeofday: vdso: 28 nsec/call Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201126131006.2431205-8-mpe@ellerman.id.au
2020-12-04powerpc/vdso: Save and restore TOC pointer on PPC64Christophe Leroy1-0/+12
On PPC64, the TOC pointer needs to be saved and restored. Suggested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201126131006.2431205-7-mpe@ellerman.id.au
2020-12-04powerpc/vdso: Prepare for switching VDSO to generic C implementation.Christophe Leroy6-0/+260
Prepare for switching VDSO to generic C implementation in following patch. Here, we: - Prepare the helpers to call the C VDSO functions - Prepare the required callbacks for the C VDSO functions - Prepare the clocksource.h files to define VDSO_ARCH_CLOCKMODES - Add the C trampolines to the generic C VDSO functions powerpc is a bit special for VDSO as well as system calls in the way that it requires setting CR SO bit which cannot be done in C. Therefore, entry/exit needs to be performed in ASM. Implementing __arch_get_vdso_data() would clobber the link register, requiring the caller to save it. As the ASM calling function already has to set a stack frame and saves the link register before calling the C vdso function, retriving the vdso data pointer there is lighter. Implement __arch_vdso_capable() and always return true. Provide vdso_shift_ns(), as the generic x >> s gives the following bad result: 18: 35 25 ff e0 addic. r9,r5,-32 1c: 41 80 00 10 blt 2c <shift+0x14> 20: 7c 64 4c 30 srw r4,r3,r9 24: 38 60 00 00 li r3,0 ... 2c: 54 69 08 3c rlwinm r9,r3,1,0,30 30: 21 45 00 1f subfic r10,r5,31 34: 7c 84 2c 30 srw r4,r4,r5 38: 7d 29 50 30 slw r9,r9,r10 3c: 7c 63 2c 30 srw r3,r3,r5 40: 7d 24 23 78 or r4,r9,r4 In our case the shift is always <= 32. In addition, the upper 32 bits of the result are likely nul. Lets GCC know it, it also optimises the following calculations. With the patch, we get: 0: 21 25 00 20 subfic r9,r5,32 4: 7c 69 48 30 slw r9,r3,r9 8: 7c 84 2c 30 srw r4,r4,r5 c: 7d 24 23 78 or r4,r9,r4 10: 7c 63 2c 30 srw r3,r3,r5 Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201126131006.2431205-6-mpe@ellerman.id.au
2020-12-04powerpc/barrier: Use CONFIG_PPC64 for barrier selectionMichael Ellerman1-1/+1
Currently we use ifdef __powerpc64__ in barrier.h to decide if we should use lwsync or eieio for SMPWMB which is then used by __smp_wmb(). That means when we are building the compat VDSO we will use eieio, because it's 32-bit code, even though we're building a 64-bit kernel for a 64-bit CPU. Although eieio should work, it would be cleaner if we always used the same barrier, even for the 32-bit VDSO. So change the ifdef to CONFIG_PPC64, so that the selection is made based on the bitness of the kernel we're building for, not the current compilation unit. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201126131006.2431205-5-mpe@ellerman.id.au
2020-12-04powerpc/time: Fix mftb()/get_tb() for use with the compat VDSOMichael Ellerman1-2/+10
When we're building the compat VDSO we are building 32-bit code but in the context of a 64-bit kernel configuration. To make this work we need to be careful in some places when using ifdefs to differentiate between CONFIG_PPC64 and __powerpc64__. CONFIG_PPC64 indicates the kernel we're building is 64-bit, but it doesn't tell us that we're currently building 64-bit code - we could be building 32-bit code for the compat VDSO. On the other hand __powerpc64__ tells us that we are currently building 64-bit code (and therefore we must also be building a 64-bit kernel). In the case of get_tb() we want to use the 32-bit code sequence regardless of whether the kernel we're building for is 64-bit or 32-bit, what matters is the word size of the current object. So we need to check __powerpc64__ to decide if we use mftb() or the mftbu()/mftb() sequence. For mftb() the logic for CPU_FTR_CELL_TB_BUG only makes sense if we're building 64-bit code, so guard that with a __powerpc64__ check. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201126131006.2431205-4-mpe@ellerman.id.au
2020-12-04powerpc/time: Move timebase functions into new asm/vdso/timebase.hChristophe Leroy4-61/+73
In order to easily use get_tb() from C VDSO, move timebase functions into a new header named asm/vdso/timebase.h Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201126131006.2431205-3-mpe@ellerman.id.au
2020-12-04powerpc/processor: Move cpu_relax() into asm/vdso/processor.hChristophe Leroy2-11/+25
cpu_relax() need to be in asm/vdso/processor.h to be used by the C VDSO generic library. Move it there. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201126131006.2431205-2-mpe@ellerman.id.au
2020-12-04powerpc/feature: Use CONFIG_PPC64 instead of __powerpc64__ to define ↵Christophe Leroy1-2/+2
possible features In order to build VDSO32 for PPC64, we need to have CPU_FTRS_POSSIBLE and CPU_FTRS_ALWAYS independant of whether we are building the 32 bits VDSO or the 64 bits VDSO. Use #ifdef CONFIG_PPC64 instead of #ifdef __powerpc64__ Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201126131006.2431205-1-mpe@ellerman.id.au
2020-12-04powerpc: Update NUMA Kconfig description & help textMichael Ellerman1-1/+7
Update the NUMA Kconfig description to match other architectures, and add some help text. Shamelessly borrowed from x86/arm64. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20201124120547.1940635-3-mpe@ellerman.id.au