From e9a2e48e8704c9d20a625c6f2357147d03ea7b97 Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Thu, 25 Feb 2021 17:17:24 -0800 Subject: drivers/base/memory: don't store phys_device in memory blocks No need to store the value for each and every memory block, as we can easily query the value at runtime. Reshuffle the members to optimize the memory layout. Also, let's clarify what the interface once was used for and why it's legacy nowadays. "phys_device" was used on s390x in older versions of lsmem[2]/chmem[3], back when they were still part of s390x-tools. They were later replaced by the variants in linux-utils. For example, RHEL6 and RHEL7 contain lsmem/chmem from s390-utils. RHEL8 switched to versions from util-linux on s390x [4]. "phys_device" was added with sysfs support for memory hotplug in commit 3947be1969a9 ("[PATCH] memory hotplug: sysfs and add/remove functions") in 2005. It always returned 0. s390x started returning something != 0 on some setups (if sclp.rzm is set by HW) in 2010 via commit 57b552ba0b2f ("memory hotplug/s390: set phys_device"). For s390x, it allowed for identifying which memory block devices belong to the same storage increment (RZM). Only if all memory block devices comprising a single storage increment were offline, the memory could actually be removed in the hypervisor. Since commit e5d709bb5fb7 ("s390/memory hotplug: provide memory_block_size_bytes() function") in 2013 a memory block device spans at least one storage increment - which is why the interface isn't really helpful/used anymore (except by old lsmem/chmem tools). There were once RFC patches to make use of "phys_device" in ACPI context; however, the underlying problem could be solved using different interfaces [1]. [1] https://patchwork.kernel.org/patch/2163871/ [2] https://github.com/ibm-s390-tools/s390-tools/blob/v2.1.0/zconf/lsmem [3] https://github.com/ibm-s390-tools/s390-tools/blob/v2.1.0/zconf/chmem [4] https://bugzilla.redhat.com/show_bug.cgi?id=1504134 Link: https://lkml.kernel.org/r/20210201181347.13262-2-david@redhat.com Signed-off-by: David Hildenbrand Acked-by: Michal Hocko Reviewed-by: Oscar Salvador Cc: Dave Hansen Cc: Greg Kroah-Hartman Cc: Gerald Schaefer Cc: Jonathan Corbet Cc: "Rafael J. Wysocki" Cc: Mauro Carvalho Chehab Cc: Ilya Dryomov Cc: Vaibhav Jain Cc: Tom Rix Cc: Geert Uytterhoeven Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- Documentation/ABI/testing/sysfs-devices-memory | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) (limited to 'Documentation/ABI') diff --git a/Documentation/ABI/testing/sysfs-devices-memory b/Documentation/ABI/testing/sysfs-devices-memory index 246a45b96d22..58dbc592bc57 100644 --- a/Documentation/ABI/testing/sysfs-devices-memory +++ b/Documentation/ABI/testing/sysfs-devices-memory @@ -26,8 +26,9 @@ Date: September 2008 Contact: Badari Pulavarty Description: The file /sys/devices/system/memory/memoryX/phys_device - is read-only and is designed to show the name of physical - memory device. Implementation is currently incomplete. + is read-only; it is a legacy interface only ever used on s390x + to expose the covered storage increment. +Users: Legacy s390-tools lsmem/chmem What: /sys/devices/system/memory/memoryX/phys_index Date: September 2008 -- cgit v1.2.3 From a89107c0478137115c6647aa28caef75513b9f40 Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Thu, 25 Feb 2021 17:17:28 -0800 Subject: Documentation: sysfs/memory: clarify some memory block device properties In commit 53cdc1cb29e8 ("drivers/base/memory.c: indicate all memory blocks as removable") we changed the output of the "removable" property of memory devices to return "1" if and only if the kernel supports memory offlining. Let's update documentation, stating that the interface is legacy. Also update documentation of the "state" property and "valid_zones" properties. Link: https://lkml.kernel.org/r/20210201181347.13262-3-david@redhat.com Signed-off-by: David Hildenbrand Acked-by: Michal Hocko Reviewed-by: Oscar Salvador Cc: Dave Hansen Cc: Jonathan Corbet Cc: David Hildenbrand Cc: Greg Kroah-Hartman Cc: Jonathan Cameron Cc: Ilya Dryomov Cc: Mauro Carvalho Chehab Cc: Geert Uytterhoeven Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- Documentation/ABI/testing/sysfs-devices-memory | 53 +++++++++++++++---------- Documentation/admin-guide/mm/memory-hotplug.rst | 16 ++++---- 2 files changed, 41 insertions(+), 28 deletions(-) (limited to 'Documentation/ABI') diff --git a/Documentation/ABI/testing/sysfs-devices-memory b/Documentation/ABI/testing/sysfs-devices-memory index 58dbc592bc57..d8b0f80b9e33 100644 --- a/Documentation/ABI/testing/sysfs-devices-memory +++ b/Documentation/ABI/testing/sysfs-devices-memory @@ -13,13 +13,13 @@ What: /sys/devices/system/memory/memoryX/removable Date: June 2008 Contact: Badari Pulavarty Description: - The file /sys/devices/system/memory/memoryX/removable - indicates whether this memory block is removable or not. - This is useful for a user-level agent to determine - identify removable sections of the memory before attempting - potentially expensive hot-remove memory operation + The file /sys/devices/system/memory/memoryX/removable is a + legacy interface used to indicated whether a memory block is + likely to be offlineable or not. Newer kernel versions return + "1" if and only if the kernel supports memory offlining. Users: hotplug memory remove tools http://www.ibm.com/developerworks/wikis/display/LinuxP/powerpc-utils + lsmem/chmem part of util-linux What: /sys/devices/system/memory/memoryX/phys_device Date: September 2008 @@ -44,23 +44,25 @@ Date: September 2008 Contact: Badari Pulavarty Description: The file /sys/devices/system/memory/memoryX/state - is read-write. When read, its contents show the - online/offline state of the memory section. When written, - root can toggle the the online/offline state of a removable - memory section (see removable file description above) - using the following commands:: + is read-write. When read, it returns the online/offline + state of the memory block. When written, root can toggle + the online/offline state of a memory block using the following + commands:: # echo online > /sys/devices/system/memory/memoryX/state # echo offline > /sys/devices/system/memory/memoryX/state - For example, if /sys/devices/system/memory/memory22/removable - contains a value of 1 and - /sys/devices/system/memory/memory22/state contains the - string "online" the following command can be executed by - by root to offline that section:: - - # echo offline > /sys/devices/system/memory/memory22/state - + On newer kernel versions, advanced states can be specified + when onlining to select a target zone: "online_movable" + selects the movable zone. "online_kernel" selects the + applicable kernel zone (DMA, DMA32, or Normal). However, + after successfully setting one of the advanced states, + reading the file will return "online"; the zone information + can be obtained via "valid_zones" instead. + + While onlining is unlikely to fail, there are no guarantees + that offlining will succeed. Offlining is more likely to + succeed if "valid_zones" indicates "Movable". Users: hotplug memory remove tools http://www.ibm.com/developerworks/wikis/display/LinuxP/powerpc-utils @@ -70,8 +72,19 @@ Date: July 2014 Contact: Zhang Zhen Description: The file /sys/devices/system/memory/memoryX/valid_zones is - read-only and is designed to show which zone this memory - block can be onlined to. + read-only. + + For online memory blocks, it returns in which zone memory + provided by a memory block is managed. If multiple zones + apply (not applicable for hotplugged memory), "None" is returned + and the memory block cannot be offlined. + + For offline memory blocks, it returns by which zone memory + provided by a memory block can be managed when onlining. + The first returned zone ("default") will be used when setting + the state of an offline memory block to "online". Only one of + the kernel zones (DMA, DMA32, Normal) is applicable for a single + memory block. What: /sys/devices/system/memoryX/nodeY Date: October 2009 diff --git a/Documentation/admin-guide/mm/memory-hotplug.rst b/Documentation/admin-guide/mm/memory-hotplug.rst index 245739f55ac7..5307f90738aa 100644 --- a/Documentation/admin-guide/mm/memory-hotplug.rst +++ b/Documentation/admin-guide/mm/memory-hotplug.rst @@ -162,14 +162,14 @@ Under each memory block, you can see 5 files: which will be performed on all sections in the block. ``phys_device`` read-only: legacy interface only ever used on s390x to expose the covered storage increment. -``removable`` read-only: contains an integer value indicating - whether the memory block is removable or not - removable. A value of 1 indicates that the memory - block is removable and a value of 0 indicates that - it is not removable. A memory block is removable only if - every section in the block is removable. -``valid_zones`` read-only: designed to show which zones this memory block - can be onlined to. +``removable`` read-only: legacy interface that indicated whether a memory + block was likely to be offlineable or not. Newer kernel + versions return "1" if and only if the kernel supports + memory offlining. +``valid_zones`` read-only: designed to show by which zone memory provided by + a memory block is managed, and to show by which zone memory + provided by an offline memory block could be managed when + onlining. The first column shows it`s default zone. -- cgit v1.2.3