summaryrefslogtreecommitdiffstats
path: root/drivers/nvdimm
AgeCommit message (Collapse)AuthorFilesLines
2020-08-07Merge tag 'powerpc-5.9-1' of ↵Linus Torvalds2-4/+5
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: - Add support for (optionally) using queued spinlocks & rwlocks. - Support for a new faster system call ABI using the scv instruction on Power9 or later. - Drop support for the PROT_SAO mmap/mprotect flag as it will be unsupported on Power10 and future processors, leaving us with no way to implement the functionality it requests. This risks breaking userspace, though we believe it is unused in practice. - A bug fix for, and then the removal of, our custom stack expansion checking. We now allow stack expansion up to the rlimit, like other architectures. - Remove the remnants of our (previously disabled) topology update code, which tried to react to NUMA layout changes on virtualised systems, but was prone to crashes and other problems. - Add PMU support for Power10 CPUs. - A change to our signal trampoline so that we don't unbalance the link stack (branch return predictor) in the signal delivery path. - Lots of other cleanups, refactorings, smaller features and so on as usual. Thanks to: Abhishek Goel, Alastair D'Silva, Alexander A. Klimov, Alexey Kardashevskiy, Alistair Popple, Andrew Donnellan, Aneesh Kumar K.V, Anju T Sudhakar, Anton Blanchard, Arnd Bergmann, Athira Rajeev, Balamuruhan S, Bharata B Rao, Bill Wendling, Bin Meng, Cédric Le Goater, Chris Packham, Christophe Leroy, Christoph Hellwig, Daniel Axtens, Dan Williams, David Lamparter, Desnes A. Nunes do Rosario, Erhard F., Finn Thain, Frederic Barrat, Ganesh Goudar, Gautham R. Shenoy, Geoff Levand, Greg Kurz, Gustavo A. R. Silva, Hari Bathini, Harish, Imre Kaloz, Joel Stanley, Joe Perches, John Crispin, Jordan Niethe, Kajol Jain, Kamalesh Babulal, Kees Cook, Laurent Dufour, Leonardo Bras, Li RongQing, Madhavan Srinivasan, Mahesh Salgaonkar, Mark Cave-Ayland, Michal Suchanek, Milton Miller, Mimi Zohar, Murilo Opsfelder Araujo, Nathan Chancellor, Nathan Lynch, Naveen N. Rao, Nayna Jain, Nicholas Piggin, Oliver O'Halloran, Palmer Dabbelt, Pedro Miraglia Franco de Carvalho, Philippe Bergheaud, Pingfan Liu, Pratik Rajesh Sampat, Qian Cai, Qinglang Miao, Randy Dunlap, Ravi Bangoria, Sachin Sant, Sam Bobroff, Sandipan Das, Santosh Sivaraj, Satheesh Rajendran, Shirisha Ganta, Sourabh Jain, Srikar Dronamraju, Stan Johnson, Stephen Rothwell, Thadeu Lima de Souza Cascardo, Thiago Jung Bauermann, Tom Lane, Vaibhav Jain, Vladis Dronov, Wei Yongjun, Wen Xiong, YueHaibing. * tag 'powerpc-5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (337 commits) selftests/powerpc: Fix pkey syscall redefinitions powerpc: Fix circular dependency between percpu.h and mmu.h powerpc/powernv/sriov: Fix use of uninitialised variable selftests/powerpc: Skip vmx/vsx/tar/etc tests on older CPUs powerpc/40x: Fix assembler warning about r0 powerpc/papr_scm: Add support for fetching nvdimm 'fuel-gauge' metric powerpc/papr_scm: Fetch nvdimm performance stats from PHYP cpuidle: pseries: Fixup exit latency for CEDE(0) cpuidle: pseries: Add function to parse extended CEDE records cpuidle: pseries: Set the latency-hint before entering CEDE selftests/powerpc: Fix online CPU selection powerpc/perf: Consolidate perf_callchain_user_[64|32]() powerpc/pseries/hotplug-cpu: Remove double free in error path powerpc/pseries/mobility: Add pr_debug() for device tree changes powerpc/pseries/mobility: Set pr_fmt() powerpc/cacheinfo: Warn if cache object chain becomes unordered powerpc/cacheinfo: Improve diagnostics about malformed cache lists powerpc/cacheinfo: Use name@unit instead of full DT path in debug messages powerpc/cacheinfo: Set pr_fmt() powerpc: fix function annotations to avoid section mismatch warnings with gcc-10 ...
2020-08-03Merge tag 'for-5.9/block-20200802' of git://git.kernel.dk/linux-blockLinus Torvalds3-6/+9
Pull core block updates from Jens Axboe: "Good amount of cleanups and tech debt removals in here, and as a result, the diffstat shows a nice net reduction in code. - Softirq completion cleanups (Christoph) - Stop using ->queuedata (Christoph) - Cleanup bd claiming (Christoph) - Use check_events, moving away from the legacy media change (Christoph) - Use inode i_blkbits consistently (Christoph) - Remove old unused writeback congestion bits (Christoph) - Cleanup/unify submission path (Christoph) - Use bio_uninit consistently, instead of bio_disassociate_blkg (Christoph) - sbitmap cleared bits handling (John) - Request merging blktrace event addition (Jan) - sysfs add/remove race fixes (Luis) - blk-mq tag fixes/optimizations (Ming) - Duplicate words in comments (Randy) - Flush deferral cleanup (Yufen) - IO context locking/retry fixes (John) - struct_size() usage (Gustavo) - blk-iocost fixes (Chengming) - blk-cgroup IO stats fixes (Boris) - Various little fixes" * tag 'for-5.9/block-20200802' of git://git.kernel.dk/linux-block: (135 commits) block: blk-timeout: delete duplicated word block: blk-mq-sched: delete duplicated word block: blk-mq: delete duplicated word block: genhd: delete duplicated words block: elevator: delete duplicated word and fix typos block: bio: delete duplicated words block: bfq-iosched: fix duplicated word iocost_monitor: start from the oldest usage index iocost: Fix check condition of iocg abs_vdebt block: Remove callback typedefs for blk_mq_ops block: Use non _rcu version of list functions for tag_set_list blk-cgroup: show global disk stats in root cgroup io.stat blk-cgroup: make iostat functions visible to stat printing block: improve discard bio alignment in __blkdev_issue_discard() block: change REQ_OP_ZONE_RESET and REQ_OP_ZONE_RESET_ALL to be odd numbers block: defer flush request no matter whether we have elevator block: make blk_timeout_init() static block: remove retry loop in ioc_release_fn() block: remove unnecessary ioc nested locking block: integrate bd_start_claiming into __blkdev_get ...
2020-07-16powerpc/pmem: Initialize pmem device on newer hardwareAneesh Kumar K.V1-0/+1
With kernel now supporting new pmem flush/sync instructions, we can now enable the kernel to initialize the device. On P10 these devices would appear with a new compatible string. For PAPR device we have compatible "ibm,pmemory-v2" and for OF pmem device we have compatible "pmem-region-v2" Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200701072235.223558-8-aneesh.kumar@linux.ibm.com
2020-07-16libnvdimm/nvdimm/flush: Allow architecture to override the flush barrierAneesh Kumar K.V1-4/+4
Architectures like ppc64 provide persistent memory specific barriers that will ensure that all stores for which the modifications are written to persistent storage by preceding dcbfps and dcbstps instructions have updated persistent storage before any data access or data transfer caused by subsequent instructions is initiated. This is in addition to the ordering done by wmb() Update nvdimm core such that architecture can use barriers other than wmb to ensure all previous writes are architecturally visible for the platform buffer flush. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200701072235.223558-5-aneesh.kumar@linux.ibm.com
2020-07-08libnvdimm/security: Fix key lookup permissionsDan Williams1-1/+1
As of commit 8c0637e950d6 ("keys: Make the KEY_NEED_* perms an enum rather than a mask") lookup_user_key() needs an explicit declaration of what it wants to do with the key. Add KEY_NEED_SEARCH to fix a warning with the below signature, and fixes the inability to retrieve a key. WARNING: CPU: 15 PID: 6276 at security/keys/permission.c:35 key_task_permission+0xd3/0x140 [..] RIP: 0010:key_task_permission+0xd3/0x140 [..] Call Trace: lookup_user_key+0xeb/0x6b0 ? vsscanf+0x3df/0x840 ? key_validate+0x50/0x50 ? key_default_cmp+0x20/0x20 nvdimm_get_user_key_payload.part.0+0x21/0x110 [libnvdimm] nvdimm_security_store+0x67d/0xb20 [libnvdimm] security_store+0x67/0x1a0 [libnvdimm] kernfs_fop_write+0xcf/0x1c0 vfs_write+0xde/0x1d0 ksys_write+0x68/0xe0 do_syscall_64+0x5c/0xa0 entry_SYSCALL_64_after_hwframe+0x49/0xb3 Fixes: 8c0637e950d6 ("keys: Make the KEY_NEED_* perms an enum rather than a mask") Suggested-by: David Howells <dhowells@redhat.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/159297332630.1304143.237026690015653759.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2020-07-01block: move ->make_request_fn to struct block_device_operationsChristoph Hellwig3-6/+9
The make_request_fn is a little weird in that it sits directly in struct request_queue instead of an operation vector. Replace it with a block_device_operations method called submit_bio (which describes much better what it does). Also remove the request_queue argument to it, as the queue can be derived pretty trivially from the bio. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-06-17nvdimm/region: always show the 'align' attributeVishal Verma1-12/+2
It is possible that a platform that is capable of 'namespace labels' comes up without the labels properly initialized. In this case, the region's 'align' attribute is hidden. Howerver, once the user does initialize he labels, the 'align' attribute still stays hidden, which is unexpected. The sysfs_update_group() API is meant to address this, and could be called during region probe, but it has entanglements with the device 'lockdep_mutex'. Therefore, simply make the 'align' attribute always visible. It doesn't matter what it says for label-less namespaces, since it is not possible to change their allocation anyway. Suggested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/r/20200520225026.29426-1-vishal.l.verma@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2020-06-13Merge tag 'libnvdimm-for-5.8' of ↵Linus Torvalds3-8/+6
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm updates from Dan Williams: "Small collection of cleanups to rework usage of ->queuedata and the GUID api" * tag 'libnvdimm-for-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: nvdimm/pmem: stop using ->queuedata nvdimm/btt: stop using ->queuedata nvdimm/blk: stop using ->queuedata libnvdimm: Replace guid_copy() with import_guid() where it makes sense
2020-06-08asm-generic: don't include <linux/mm.h> in cacheflush.hChristoph Hellwig1-1/+2
This seems to lead to some crazy include loops when using asm-generic/cacheflush.h on more architectures, so leave it to the arch header for now. [hch@lst.de: fix warning] Link: http://lkml.kernel.org/r/20200520173520.GA11199@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Will Deacon <will@kernel.org> Cc: Nick Piggin <npiggin@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Jeff Dike <jdike@addtoit.com> Cc: Richard Weinberger <richard@nod.at> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Keith Busch <keith.busch@intel.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Link: http://lkml.kernel.org/r/20200515143646.3857579-7-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-27nvdimm: use bio_{start,end}_io_acctChristoph Hellwig4-25/+12
Switch dm to use the nicer bio accounting helpers. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-13nvdimm/pmem: stop using ->queuedataChristoph Hellwig1-3/+3
In preparation for removing queuedata as an argument to make_request_fn() drop the dependency ->queuedata. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20200508161517.252308-16-hch@lst.de Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2020-05-13nvdimm/btt: stop using ->queuedataChristoph Hellwig1-2/+1
In preparation for removing queuedata as an argument to make_request_fn() drop the dependency ->queuedata. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20200508161517.252308-15-hch@lst.de Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2020-05-13nvdimm/blk: stop using ->queuedataChristoph Hellwig1-3/+2
In preparation for removing queuedata as an argument to make_request_fn() drop the dependency ->queuedata. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20200508161517.252308-14-hch@lst.de Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2020-04-08Merge tag 'libnvdimm-for-5.7' of ↵Linus Torvalds12-103/+342
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm and dax updates from Dan Williams: "There were multiple touches outside of drivers/nvdimm/ this round to add cross arch compatibility to the devm_memremap_pages() interface, enhance numa information for persistent memory ranges, and add a zero_page_range() dax operation. This cycle I switched from the patchwork api to Konstantin's b4 script for collecting tags (from x86, PowerPC, filesystem, and device-mapper folks), and everything looks to have gone ok there. This has all appeared in -next with no reported issues. Summary: - Add support for region alignment configuration and enforcement to fix compatibility across architectures and PowerPC page size configurations. - Introduce 'zero_page_range' as a dax operation. This facilitates filesystem-dax operation without a block-device. - Introduce phys_to_target_node() to facilitate drivers that want to know resulting numa node if a given reserved address range was onlined. - Advertise a persistence-domain for of_pmem and papr_scm. The persistence domain indicates where cpu-store cycles need to reach in the platform-memory subsystem before the platform will consider them power-fail protected. - Promote numa_map_to_online_node() to a cross-kernel generic facility. - Save x86 numa information to allow for node-id lookups for reserved memory ranges, deploy that capability for the e820-pmem driver. - Pick up some miscellaneous minor fixes, that missed v5.6-final, including a some smatch reports in the ioctl path and some unit test compilation fixups. - Fixup some flexible-array declarations" * tag 'libnvdimm-for-5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (29 commits) dax: Move mandatory ->zero_page_range() check in alloc_dax() dax,iomap: Add helper dax_iomap_zero() to zero a range dax: Use new dax zero page method for zeroing a page dm,dax: Add dax zero_page_range operation s390,dcssblk,dax: Add dax zero_page_range operation to dcssblk driver dax, pmem: Add a dax operation zero_page_range pmem: Add functions for reading/writing page to/from pmem libnvdimm: Update persistence domain value for of_pmem and papr_scm device tools/test/nvdimm: Fix out of tree build libnvdimm/region: Fix build error libnvdimm/region: Replace zero-length array with flexible-array member libnvdimm/label: Replace zero-length array with flexible-array member ACPI: NFIT: Replace zero-length array with flexible-array member libnvdimm/region: Introduce an 'align' attribute libnvdimm/region: Introduce NDD_LABELING libnvdimm/namespace: Enforce memremap_compat_align() libnvdimm/pfn: Prevent raw mode fallback if pfn-infoblock valid libnvdimm: Out of bounds read in __nd_ioctl() acpi/nfit: improve bounds checking for 'func' mm/memremap_pages: Introduce memremap_compat_align() ...
2020-04-02Merge branch 'for-5.7/libnvdimm' into libnvdimm-for-nextDan Williams4-42/+69
- Introduce 'zero_page_range' as a dax operation. This facilitates filesystem-dax operation without a block-device. - Advertise a persistence-domain for of_pmem and papr_scm. The persistence domain indicates where cpu-store cycles need to reach in the platform-memory subsystem before the platform will consider them power-fail protected. - Fixup some flexible-array declarations.
2020-04-02Merge branch 'for-5.7/numa' into libnvdimm-for-nextDan Williams1-14/+4
- Promote numa_map_to_online_node() to a cross-kernel generic facility. - Save x86 numa information to allow for node-id lookups for reserved memory ranges, deploy that capability for the e820-pmem driver. - Introduce phys_to_target_node() to facilitate drivers that want to know resulting numa node if a given reserved address range was onlined.
2020-04-02Merge branch 'for-5.6/libnvdimm-fixes' into libnvdimm-for-nextDan Williams1-2/+4
Pick up some miscellaneous minor fixes, that missed v5.6-final, including a some smatch reports in the ioctl path and some unit test compilation fixups.
2020-04-02dax: Move mandatory ->zero_page_range() check in alloc_dax()Vivek Goyal1-2/+2
zero_page_range() dax operation is mandatory for dax devices. Right now that check happens in dax_zero_page_range() function. Dan thinks that's too late and its better to do the check earlier in alloc_dax(). I also modified alloc_dax() to return pointer with error code in it in case of failure. Right now it returns NULL and caller assumes failure happened due to -ENOMEM. But with this ->zero_page_range() check, I need to return -EINVAL instead. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Link: https://lore.kernel.org/r/20200401161125.GB9398@redhat.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2020-04-02dax, pmem: Add a dax operation zero_page_rangeVivek Goyal1-0/+11
Add a dax operation zero_page_range, to zero a page. This will also clear any known poison in the page being zeroed. As of now, zeroing of one page is allowed in a single call. There are no callers which are trying to zero more than a page in a single call. Once we grow the callers which zero more than a page in single call, we can add that support. Primary reason for not doing that yet is that this will add little complexity in dm implementation where a range might be spanning multiple underlying targets and one will have to split the range into multiple sub ranges and call zero_page_range() on individual targets. Suggested-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Link: https://lore.kernel.org/r/20200228163456.1587-3-vgoyal@redhat.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2020-04-02pmem: Add functions for reading/writing page to/from pmemVivek Goyal1-36/+50
This splits pmem_do_bvec() into pmem_do_read() and pmem_do_write(). pmem_do_write() will be used by pmem zero_page_range() as well. Hence sharing the same code. Suggested-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Link: https://lore.kernel.org/r/20200228163456.1587-2-vgoyal@redhat.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2020-03-31libnvdimm: Update persistence domain value for of_pmem and papr_scm deviceAneesh Kumar K.V1-1/+3
Currently, kernel shows the below values "persistence_domain":"cpu_cache" "persistence_domain":"memory_controller" "persistence_domain":"unknown" "cpu_cache" indicates no extra instructions is needed to ensure the persistence of data in the pmem media on power failure. "memory_controller" indicates cpu cache flush instructions are required to flush the data. Platform provides mechanisms to automatically flush outstanding write data from memory controler to pmem on system power loss. Based on the above use memory_controller for non volatile regions on ppc64. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Link: https://lore.kernel.org/r/20200324034821.60869-1-aneesh.kumar@linux.ibm.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2020-03-31libnvdimm/region: Fix build errorYueHaibing1-5/+3
On CONFIG_PPC32=y build fails: drivers/nvdimm/region_devs.c:1034:14: note: in expansion of macro ‘do_div’ remainder = do_div(per_mapping, mappings); ^~~~~~ In file included from ./arch/powerpc/include/generated/asm/div64.h:1:0, from ./include/linux/kernel.h:18, from ./include/asm-generic/bug.h:19, from ./arch/powerpc/include/asm/bug.h:109, from ./include/linux/bug.h:5, from ./include/linux/scatterlist.h:7, from drivers/nvdimm/region_devs.c:5: ./include/asm-generic/div64.h:243:22: error: passing argument 1 of ‘__div64_32’ from incompatible pointer type [-Werror=incompatible-pointer-types] __rem = __div64_32(&(n), __base); \ Use div_u64 instead of do_div to fix this. Fixes: 2522afb86a8c ("libnvdimm/region: Introduce an 'align' attribute") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Link: https://lore.kernel.org/r/20200331115024.31628-1-yuehaibing@huawei.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2020-03-30libnvdimm/region: Replace zero-length array with flexible-array memberGustavo A. R. Silva1-2/+2
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Link: https://lore.kernel.org/r/20200319230937.GA16648@embeddedor.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2020-03-30libnvdimm/label: Replace zero-length array with flexible-array memberGustavo A. R. Silva1-1/+1
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Link: https://lore.kernel.org/r/20200319230737.GA16452@embeddedor.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2020-03-27block: simplify queue allocationChristoph Hellwig3-6/+3
Current make_request based drivers use either blk_alloc_queue_node or blk_alloc_queue to allocate a queue, and then set up the make_request_fn function pointer and a few parameters using the blk_queue_make_request helper. Simplify this by passing the make_request pointer to blk_alloc_queue, and while at it merge the _node variant into the main helper by always passing a node_id, and remove the superfluous gfp_mask parameter. A lower-level __blk_alloc_queue is kept for the blk-mq case. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-03-17libnvdimm/region: Introduce an 'align' attributeDan Williams4-26/+192
The align attribute applies an alignment constraint for namespace creation in a region. Whereas the 'align' attribute of a namespace applied alignment padding via an info block, the 'align' attribute applies alignment constraints to the free space allocation. The default for 'align' is the maximum known memremap_compat_align() across all archs (16MiB from PowerPC at time of writing) multiplied by the number of interleave ways if there is blk-aliasing. The minimum is PAGE_SIZE and allows for the creation of cross-arch incompatible namespaces, just as previous kernels allowed, but the expectation is cross-arch and mode-independent compatibility by default. The regression risk with this change is limited to cases that were dependent on the ability to create unaligned namespaces, *and* for some reason are unable to opt-out of aligned namespaces by writing to 'regionX/align'. If such a scenario arises the default can be flipped from opt-out to opt-in of compat-aligned namespace creation, but that is a last resort. The kernel will otherwise continue to support existing defined misaligned namespaces. Unfortunately this change needs to touch several parts of the implementation at once: - region/available_size: expand busy extents to current align - region/max_available_extent: expand busy extents to current align - namespace/size: trim free space to current align ...to keep the free space accounting conforming to the dynamic align setting. Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Reported-by: Jeff Moyer <jmoyer@redhat.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Link: https://lore.kernel.org/r/158041478371.3889308.14542630147672668068.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2020-03-17libnvdimm/region: Introduce NDD_LABELINGDan Williams5-12/+13
The NDD_ALIASING flag is used to indicate where pmem capacity might alias with blk capacity and require labeling. It is also used to indicate whether the DIMM supports labeling. Separate this latter capability into its own flag so that the NDD_ALIASING flag is scoped to true aliased configurations. To my knowledge aliased configurations only exist in the ACPI spec, there are no known platforms that ship this support in production. This clarity allows namespace-capacity alignment constraints around interleave-ways to be relaxed. Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Link: https://lore.kernel.org/r/158041477856.3889308.4212605617834097674.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2020-03-17libnvdimm/namespace: Enforce memremap_compat_align()Dan Williams3-3/+58
The pmem driver on PowerPC crashes with the following signature when instantiating misaligned namespaces that map their capacity via memremap_pages(). BUG: Unable to handle kernel data access at 0xc001000406000000 Faulting instruction address: 0xc000000000090790 NIP [c000000000090790] arch_add_memory+0xc0/0x130 LR [c000000000090744] arch_add_memory+0x74/0x130 Call Trace: arch_add_memory+0x74/0x130 (unreliable) memremap_pages+0x74c/0xa30 devm_memremap_pages+0x3c/0xa0 pmem_attach_disk+0x188/0x770 nvdimm_bus_probe+0xd8/0x470 With the assumption that only memremap_pages() has alignment constraints, enforce memremap_compat_align() for pmem_should_map_pages(), nd_pfn, and nd_dax cases. This includes preventing the creation of namespaces where the base address is misaligned and cases there infoblock padding parameters are invalid. Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Jeff Moyer <jmoyer@redhat.com> Fixes: a3619190d62e ("libnvdimm/pfn: stop padding pmem namespaces to section alignment") Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2020-03-17libnvdimm/pfn: Prevent raw mode fallback if pfn-infoblock validDan Williams1-4/+4
The EOPNOTSUPP return code from the pmem driver indicates that the namespace has a configuration that may be valid, but the current kernel does not support it. Expand this to all of the nd_pfn_validate() error conditions after the infoblock has been verified as self consistent. This prevents exposing the namespace to I/O when the infoblock needs to be corrected, or the system needs to be put into a different configuration (like changing the page size on PowerPC). Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Jeff Moyer <jmoyer@redhat.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2020-02-28libnvdimm: Out of bounds read in __nd_ioctl()Dan Carpenter1-2/+4
The "cmd" comes from the user and it can be up to 255. It it's more than the number of bits in long, it results out of bounds read when we check test_bit(cmd, &cmd_mask). The highest valid value for "cmd" is ND_CMD_CALL (10) so I added a compare against that. Fixes: 62232e45f4a2 ("libnvdimm: control (ioctl) messages for nvdimm_bus and nvdimm devices") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/20200225162055.amtosfy7m35aivxg@kili.mountain Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2020-02-20mm/memremap_pages: Introduce memremap_compat_align()Dan Williams1-1/+1
The "sub-section memory hotplug" facility allows memremap_pages() users like libnvdimm to compensate for hardware platforms like x86 that have a section size larger than their hardware memory mapping granularity. The compensation that sub-section support affords is being tolerant of physical memory resources shifting by units smaller (64MiB on x86) than the memory-hotplug section size (128 MiB). Where the platform physical-memory mapping granularity is limited by the number and capability of address-decode-registers in the memory controller. While the sub-section support allows memremap_pages() to operate on sub-section (2MiB) granularity, the Power architecture may still require 16MiB alignment on "!radix_enabled()" platforms. In order for libnvdimm to be able to detect and manage this per-arch limitation, introduce memremap_compat_align() as a common minimum alignment across all driver-facing memory-mapping interfaces, and let Power override it to 16MiB in the "!radix_enabled()" case. The assumption / requirement for 16MiB to be a viable memremap_compat_align() value is that Power does not have platforms where its equivalent of address-decode-registers never hardware remaps a persistent memory resource on smaller than 16MiB boundaries. Note that I tried my best to not add a new Kconfig symbol, but header include entanglements defeated the #ifndef memremap_compat_align design pattern and the need to export it defeats the __weak design pattern for arch overrides. Based on an initial patch by Aneesh. Link: http://lore.kernel.org/r/CAPcyv4gBGNP95APYaBcsocEa50tQj9b5h__83vgngjq3ouGX_Q@mail.gmail.com Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Reported-by: Jeff Moyer <jmoyer@redhat.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2020-02-18libnvdimm/e820: Retrieve and populate correct 'target_node' infoDan Williams1-14/+4
Use the new phys_to_target_node() and numa_map_to_online_node() helpers to retrieve the correct id for the 'numa_node' ("local" / online initiator node) and 'target_node' (offline target memory node) sysfs attributes. Below is an example from a 4 NUMA node system where all the memory on node2 is pmem / reserved. It should be noted that with the arrival of the ACPI HMAT table and EFI Specific Purpose Memory the kernel will start to see more platforms with reserved / performance differentiated memory in its own NUMA node. Hence all the stakeholders on the Cc for what is ostensibly a libnvdimm local patch. === Before === /* Notice no online memory on node2 at start */ # numactl --hardware available: 3 nodes (0-1,3) node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 node 0 size: 3958 MB node 0 free: 3708 MB node 1 cpus: 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 node 1 size: 4027 MB node 1 free: 3871 MB node 3 cpus: node 3 size: 3994 MB node 3 free: 3971 MB node distances: node 0 1 3 0: 10 21 21 1: 21 10 21 3: 21 21 10 /* * Put the pmem namespace into devdax mode so it can be assigned to the * kmem driver */ # ndctl create-namespace -e namespace0.0 -m devdax -f { "dev":"namespace0.0", "mode":"devdax", "map":"dev", "size":"3.94 GiB (4.23 GB)", "uuid":"1650af9b-9ba3-4704-acd6-10178399d9a3", [..] } /* Online Persistent Memory as System RAM */ # daxctl reconfigure-device --mode=system-ram dax0.0 libdaxctl: memblock_in_dev: dax0.0: memory0: Unable to determine phys_index: Success libdaxctl: memblock_in_dev: dax0.0: memory0: Unable to determine phys_index: Success libdaxctl: memblock_in_dev: dax0.0: memory0: Unable to determine phys_index: Success libdaxctl: memblock_in_dev: dax0.0: memory0: Unable to determine phys_index: Success [ { "chardev":"dax0.0", "size":4225761280, "target_node":0, "mode":"system-ram" } ] reconfigured 1 device /* Note that the memory is onlined by default to the wrong node, node0 */ # numactl --hardware available: 3 nodes (0-1,3) node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 node 0 size: 7926 MB node 0 free: 7655 MB node 1 cpus: 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 node 1 size: 4027 MB node 1 free: 3871 MB node 3 cpus: node 3 size: 3994 MB node 3 free: 3971 MB node distances: node 0 1 3 0: 10 21 21 1: 21 10 21 3: 21 21 10 === After === /* Notice that the "phys_index" error messages are gone */ # daxctl reconfigure-device --mode=system-ram dax0.0 [ { "chardev":"dax0.0", "size":4225761280, "target_node":2, "mode":"system-ram" } ] reconfigured 1 device /* Notice that node2 is now correctly populated */ # numactl --hardware available: 4 nodes (0-3) node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 node 0 size: 3958 MB node 0 free: 3793 MB node 1 cpus: 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 node 1 size: 4027 MB node 1 free: 3851 MB node 2 cpus: node 2 size: 3968 MB node 2 free: 3968 MB node 3 cpus: node 3 size: 3994 MB node 3 free: 3908 MB node distances: node 0 1 2 3 0: 10 21 21 21 1: 21 10 21 21 2: 21 21 10 21 3: 21 21 21 10 Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: David Hildenbrand <david@redhat.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Christoph Hellwig <hch@lst.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/r/158188327614.894464.13122730362187722603.stgit@dwillia2-desk3.amr.corp.intel.com
2020-01-31mm: Cleanup __put_devmap_managed_page() vs ->page_free()Dan Williams1-6/+0
After the removal of the device-public infrastructure there are only 2 ->page_free() call backs in the kernel. One of those is a device-private callback in the nouveau driver, the other is a generic wakeup needed in the DAX case. In the hopes that all ->page_free() callbacks can be migrated to common core kernel functionality, move the device-private specific actions in __put_devmap_managed_page() under the is_device_private_page() conditional, including the ->page_free() callback. For the other page types just open-code the generic wakeup. Yes, the wakeup is only needed in the MEMORY_DEVICE_FSDAX case, but it does no harm in the MEMORY_DEVICE_DEVDAX and MEMORY_DEVICE_PCI_P2PDMA case. Link: http://lkml.kernel.org/r/20200107224558.2362728-4-jhubbard@nvidia.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: John Hubbard <jhubbard@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jérôme Glisse <jglisse@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Björn Töpel <bjorn.topel@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl> Cc: Jason Gunthorpe <jgg@mellanox.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jens Axboe <axboe@kernel.dk> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Leon Romanovsky <leonro@mellanox.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-12-01Merge tag 'libnvdimm-for-5.5' of ↵Linus Torvalds15-306/+364
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm updates from Dan Williams: "The highlight this cycle is continuing integration fixes for PowerPC and some resulting optimizations. Summary: - Updates to better support vmalloc space restrictions on PowerPC platforms. - Cleanups to move common sysfs attributes to core 'struct device_type' objects. - Export the 'target_node' attribute (the effective numa node if pmem is marked online) for regions and namespaces. - Miscellaneous fixups and optimizations" * tag 'libnvdimm-for-5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (21 commits) MAINTAINERS: Remove Keith from NVDIMM maintainers libnvdimm: Export the target_node attribute for regions and namespaces dax: Add numa_node to the default device-dax attributes libnvdimm: Simplify root read-only definition for the 'resource' attribute dax: Simplify root read-only definition for the 'resource' attribute dax: Create a dax device_type libnvdimm: Move nvdimm_bus_attribute_group to device_type libnvdimm: Move nvdimm_attribute_group to device_type libnvdimm: Move nd_mapping_attribute_group to device_type libnvdimm: Move nd_region_attribute_group to device_type libnvdimm: Move nd_numa_attribute_group to device_type libnvdimm: Move nd_device_attribute_group to device_type libnvdimm: Move region attribute group definition libnvdimm: Move attribute groups to device type libnvdimm: Remove prototypes for nonexistent functions libnvdimm/btt: fix variable 'rc' set but not used libnvdimm/pmem: Delete include of nd-core.h libnvdimm/namespace: Differentiate between probe mapping and runtime mapping libnvdimm/pfn_dev: Don't clear device memmap area during generic namespace probe libnvdimm: Trivial comment fix ...
2019-12-01Merge tag 'compat-ioctl-5.5' of ↵Linus Torvalds1-2/+2
git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground Pull removal of most of fs/compat_ioctl.c from Arnd Bergmann: "As part of the cleanup of some remaining y2038 issues, I came to fs/compat_ioctl.c, which still has a couple of commands that need support for time64_t. In completely unrelated work, I spent time on cleaning up parts of this file in the past, moving things out into drivers instead. After Al Viro reviewed an earlier version of this series and did a lot more of that cleanup, I decided to try to completely eliminate the rest of it and move it all into drivers. This series incorporates some of Al's work and many patches of my own, but in the end stops short of actually removing the last part, which is the scsi ioctl handlers. I have patches for those as well, but they need more testing or possibly a rewrite" * tag 'compat-ioctl-5.5' of git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground: (42 commits) scsi: sd: enable compat ioctls for sed-opal pktcdvd: add compat_ioctl handler compat_ioctl: move SG_GET_REQUEST_TABLE handling compat_ioctl: ppp: move simple commands into ppp_generic.c compat_ioctl: handle PPPIOCGIDLE for 64-bit time_t compat_ioctl: move PPPIOCSCOMPRESS to ppp_generic compat_ioctl: unify copy-in of ppp filters tty: handle compat PPP ioctls compat_ioctl: move SIOCOUTQ out of compat_ioctl.c compat_ioctl: handle SIOCOUTQNSD af_unix: add compat_ioctl support compat_ioctl: reimplement SG_IO handling compat_ioctl: move WDIOC handling into wdt drivers fs: compat_ioctl: move FITRIM emulation into file systems gfs2: add compat_ioctl support compat_ioctl: remove unused convert_in_user macro compat_ioctl: remove last RAID handling code compat_ioctl: remove /dev/raw ioctl translation compat_ioctl: remove PCI ioctl translation compat_ioctl: remove joystick ioctl translation ...
2019-11-19libnvdimm: Export the target_node attribute for regions and namespacesDan Williams1-0/+29
Aneesh points out that some platforms may have "local" attached persistent memory and "remote" persistent memory that map to the same "online" node, or persistent memory devices with different performance properties. In this case 'numa_node' is identical for the two instances, but 'target_node' is differentiated so platform firmware can communicate distinct performance properties per range. Expose 'target_node' by default to allow for disambiguation of devices that share the same numa_map_to_online_node() result. Reported-by: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Link: https://lore.kernel.org/r/157401274500.43284.2369509941678577768.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2019-11-19libnvdimm: Simplify root read-only definition for the 'resource' attributeDan Williams3-22/+7
Rather than update the permission in ->is_visible() set the permission directly at declaration time. Cc: Ira Weiny <ira.weiny@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Link: https://lore.kernel.org/r/157309905534.1582359.13927459228885931097.stgit@dwillia2-desk3.amr.corp.intel.com
2019-11-19libnvdimm: Move nvdimm_bus_attribute_group to device_typeDan Williams5-16/+14
A 'struct device_type' instance can carry default attributes for the device. Use this facility to remove the export of nvdimm_bus_attribute_group and put the responsibility on the core rather than leaf implementations to define this attribute. Cc: Ira Weiny <ira.weiny@intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Oliver O'Halloran" <oohall@gmail.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Link: https://lore.kernel.org/r/157309903815.1582359.6418211876315050283.stgit@dwillia2-desk3.amr.corp.intel.com
2019-11-19libnvdimm: Move nvdimm_attribute_group to device_typeDan Williams1-18/+18
A 'struct device_type' instance can carry default attributes for the device. Use this facility to remove the export of nvdimm_attribute_group and put the responsibility on the core rather than leaf implementations to define this attribute. Cc: Ira Weiny <ira.weiny@intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Oliver O'Halloran" <oohall@gmail.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Link: https://lore.kernel.org/r/157309903201.1582359.10966209746585062329.stgit@dwillia2-desk3.amr.corp.intel.com
2019-11-19libnvdimm: Move nd_mapping_attribute_group to device_typeDan Williams1-2/+2
A 'struct device_type' instance can carry default attributes for the device. Use this facility to remove the export of nd_mapping_attribute_group and put the responsibility on the core rather than leaf implementations to define this attribute. Cc: Ira Weiny <ira.weiny@intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Oliver O'Halloran" <oohall@gmail.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Link: https://lore.kernel.org/r/157309902686.1582359.6749533709859492704.stgit@dwillia2-desk3.amr.corp.intel.com
2019-11-19libnvdimm: Move nd_region_attribute_group to device_typeDan Williams3-14/+2
A 'struct device_type' instance can carry default attributes for the device. Use this facility to remove the export of nd_region_attribute_group and put the responsibility on the core rather than leaf implementations to define this attribute. Cc: Ira Weiny <ira.weiny@intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Oliver O'Halloran" <oohall@gmail.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Link: https://lore.kernel.org/r/157309902169.1582359.16828508538444551337.stgit@dwillia2-desk3.amr.corp.intel.com
2019-11-19libnvdimm: Move nd_numa_attribute_group to device_typeDan Williams3-2/+3
A 'struct device_type' instance can carry default attributes for the device. Use this facility to remove the export of nd_numa_attribute_group and put the responsibility on the core rather than leaf implementations to define this attribute. Cc: Ira Weiny <ira.weiny@intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Oliver O'Halloran" <oohall@gmail.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Link: https://lore.kernel.org/r/157401269537.43284.14411189404186877352.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2019-11-17libnvdimm: Move nd_device_attribute_group to device_typeDan Williams6-10/+22
A 'struct device_type' instance can carry default attributes for the device. Use this facility to remove the export of nd_device_attribute_group and put the responsibility on the core rather than leaf implementations to define this attribute. For regions this creates a new nd_region_attribute_groups[] added to the per-region device-type instances. Cc: Ira Weiny <ira.weiny@intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Oliver O'Halloran" <oohall@gmail.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Link: https://lore.kernel.org/r/157309901138.1582359.12909354140826530394.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2019-11-17libnvdimm: Move region attribute group definitionDan Williams1-104/+104
In preparation for moving region attributes from device attribute groups to the region device-type, reorder the declaration so that it can be referenced by the device-type definition without forward declarations. No functional changes are intended to result from this change. Cc: Ira Weiny <ira.weiny@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Link: https://lore.kernel.org/r/157309900624.1582359.6929998072035982264.stgit@dwillia2-desk3.amr.corp.intel.com
2019-11-17libnvdimm: Move attribute groups to device typeDan Williams5-76/+73
Statically initialize the attribute groups for each libnvdimm device_type. This is a preparation step for removing unnecessary exports of attributes that can be included in the device_type by default. Also take the opportunity to mark 'struct device_type' instances const. Cc: Ira Weiny <ira.weiny@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Link: https://lore.kernel.org/r/157309900111.1582359.2445687530383470348.stgit@dwillia2-desk3.amr.corp.intel.com
2019-11-17libnvdimm: Remove prototypes for nonexistent functionsAlastair D'Silva1-4/+0
These functions don't exist, so remove the prototypes for them. Signed-off-by: Alastair D'Silva <alastair@d-silva.org> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com> Link: https://lore.kernel.org/r/20191025044721.16617-3-alastair@au1.ibm.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2019-11-17libnvdimm/btt: fix variable 'rc' set but not usedQian Cai1-4/+4
drivers/nvdimm/btt.c: In function 'btt_read_pg': drivers/nvdimm/btt.c:1264:8: warning: variable 'rc' set but not used [-Wunused-but-set-variable] int rc; ^~ Add a ratelimited message in case a storm of errors is encountered. Fixes: d9b83c756953 ("libnvdimm, btt: rework error clearing") Signed-off-by: Qian Cai <cai@lca.pw> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Link: https://lore.kernel.org/r/1572530719-32161-1-git-send-email-cai@lca.pw Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2019-11-17libnvdimm/pmem: Delete include of nd-core.hDan Williams1-1/+0
The entire point of nd-core.h is to hide functionality that no leaf driver should touch. In fact, the commit that added it had no need to include it. Fixes: 06e8ccdab15f ("acpi: nfit: Add support for detect platform...") Cc: Ira Weiny <ira.weiny@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2019-11-14libnvdimm/namespace: Differentiate between probe mapping and runtime mappingAneesh Kumar K.V7-34/+81
The nvdimm core currently maps the full namespace to an ioremap range while probing the namespace mode. This can result in probe failures on architectures that have limited ioremap space. For example, with a large btt namespace that consumes most of I/O remap range, depending on the sequence of namespace initialization, the user can find a pfn namespace initialization failure due to unavailable I/O remap space which nvdimm core uses for temporary mapping. nvdimm core can avoid this failure by only mapping the reserved info block area to check for pfn superblock type and map the full namespace resource only before using the namespace. Given that personalities like BTT can be layered on top of any namespace type create a generic form of devm_nsio_enable (devm_namespace_enable) and use it inside the per-personality attach routines. Now devm_namespace_enable() is always paired with disable unless the mapping is going to be used for long term runtime access. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Link: https://lore.kernel.org/r/20191017073308.32645-1-aneesh.kumar@linux.ibm.com [djbw: reworks to move devm_namespace_{en,dis}able into *attach helpers] Reported-by: kbuild test robot <lkp@intel.com> Link: https://lore.kernel.org/r/20191031105741.102793-2-aneesh.kumar@linux.ibm.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2019-11-14libnvdimm/pfn_dev: Don't clear device memmap area during generic namespace probeAneesh Kumar K.V1-1/+7
nvdimm core use nd_pfn_validate when looking for devdax or fsdax namespace. In this case device resources are allocated against nd_namespace_io dev. In-order to allow remap of range in nd_pfn_clear_memmap_error(), move the device memmap area clearing while initializing pfn namespace. With this device resource are allocated against nd_pfn and we can use nd_pfn->dev for remapping. This also avoids calling nd_pfn_clear_mmap_errors twice. Once while probing the namespace and second while initializing a pfn namespace. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Link: https://lore.kernel.org/r/20191101032728.113001-1-aneesh.kumar@linux.ibm.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>