s390: select ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMAP

Enable HUGETLB_PAGE_OPTIMIZE_VMEMMAP for s390. With this, vmemmap pages used
to back struct pages for compound tail pages of hugetlb pages are freed
and remapped to compound head page frame as RO, see also
Documentation/vm/vmemmap_dedup.rst.

For 1M hugetlb pages, this results in freeing 3 of 4 vmemmap pages, saving
12K of memory for each 1M hugetlb page (~1.2%).

/sys/kernel/debug/kernel_page_tables will show the impact:

    ---[ vmemmap Area Start ]---
    [...]
    0x0000037202d84000-0x0000037202d85000         4K PTE RW NX
    0x0000037202d85000-0x0000037202d88000        12K PTE RO NX

For 2G hugetlb pages, this results in freeing 8191 of 8192 vmemmap pages,
saving 32764K of memory for each 2G hugetlb page (~1.6%).

/sys/kernel/debug/kernel_page_tables will show the impact:

    ---[ vmemmap Area Start ]---
    [...]
    0x000003720a000000-0x000003720a001000         4K PTE RW NX
    0x000003720a001000-0x000003720c000000     32764K PTE RO NX

The memory savings come with some costs:

- vmemmap mapping for compound hugetlb pages is not a PMD mapping any more,
  but split to 4K PTE mappings, and it will not be coalesced back to PMD
  mapping after freeing hugetlb pages from the pool. Apart from theoretical
  performance impact, this will also (slightly) relativize the memory
  savings because of additional 2K PTE pagetable allocations.
- Workload using "on the fly" hugetlb allocations via
  "nr_overcommit_hugepages" instead of using the hugetlb pool via
  "nr_hugepages" will suffer from considerably increased fault handling
  time, see also description from commit 78f39084b41d ("mm: hugetlb_vmemmap:
  add hugetlb_optimize_vmemmap sysctl").
- Freeing hugetlb pages from the pool will require re-allocation of the
  freed struct pages, and therefore needs some memory available to the
  kernel. This might fail in memory constrained scenarios.
- For the same reason, memory offline might fail even for ZONE_MOVABLE when
  hugetlb pages are present (but not for s390, since we do not support
  ARCH_ENABLE_HUGEPAGE_MIGRATION, and therefore cannot have hugetlb pages
  in ZONE_MOVABLE).
- General increased complexity and overhead in kernel handling of compound
  (head) pages.

Therefore, this feature is disabled by default, and has to be enabled
explicitly either by adding "hugetlb_free_vmemmap=on" kernel parameter,
or during run-time via "/proc/sys/vm/hugetlb_optimize_vmemmap" sysctl.

Acked-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
author: Gerald Schaefer <gerald.schaefer@linux.ibm.com> 2022-07-19 11:08:37 +0200
committer: Alexander Gordeev <agordeev@linux.ibm.com> 2022-11-10 08:00:41 +0100
commit: 00a34d5a99c0631bd780b14cbe3813d0b39c3886 (patch)
tree: d1fa7d7fba7bc82fa8e0b878f977b116c9246131
parent: 58354c7d35d35dd119ada18ff84a6686ccc8743f (diff)
download: linux-00a34d5a99c0631bd780b14cbe3813d0b39c3886.tar.bz2
1 files changed, 1 insertions, 0 deletions
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index 318fce77601d..a006dbb44890 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -121,6 +121,7 @@ config S390
 	select ARCH_WANTS_NO_INSTR
 	select ARCH_WANT_DEFAULT_BPF_JIT
 	select ARCH_WANT_IPC_PARSE_VERSION
+	select ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
 	select BUILDTIME_TABLE_SORT
 	select CLONE_BACKWARDS2
 	select DMA_OPS if PCI
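
As a rough cross-check of the savings figures quoted in the commit message, the arithmetic can be sketched in a few lines of Python. This is not part of the patch. It assumes a 4K base page size and a 64-byte struct page (neither value is stated in the commit message, though both are consistent with the "3 of 4" and "8191 of 8192" counts), and the function name vmemmap_savings is purely illustrative.

```python
# Assumptions (not stated in the commit message): 4K base pages and a
# 64-byte struct page. Only the head vmemmap page is retained, matching the
# "3 of 4" and "8191 of 8192" counts given in the commit message.
PAGE_SIZE = 4096
STRUCT_PAGE_SIZE = 64
RETAINED_VMEMMAP_PAGES = 1


def vmemmap_savings(hugepage_size):
    """Return (vmemmap_pages, freed_pages, saved_kib, saved_percent)."""
    base_pages = hugepage_size // PAGE_SIZE          # struct pages needed
    vmemmap_bytes = base_pages * STRUCT_PAGE_SIZE    # memory backing them
    vmemmap_pages = vmemmap_bytes // PAGE_SIZE
    freed_pages = vmemmap_pages - RETAINED_VMEMMAP_PAGES
    saved_bytes = freed_pages * PAGE_SIZE
    saved_percent = 100.0 * saved_bytes / hugepage_size
    return vmemmap_pages, freed_pages, saved_bytes // 1024, saved_percent


if __name__ == "__main__":
    for label, size in (("1M", 1 << 20), ("2G", 2 << 30)):
        total, freed, kib, pct = vmemmap_savings(size)
        print(f"{label}: freed {freed} of {total} vmemmap pages, "
              f"saving {kib}K (~{pct:.1f}%)")
```

Under these assumptions the sketch reproduces the page counts, the 12K and 32764K savings, and the rounded percentages from the commit message. It does not account for the additional 2K PTE pagetable allocations mentioned among the costs.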