summaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2019-04-18thunderbolt: Add helper function to iterate from one port to anotherMika Westerberg2-0/+56
We need to be able to walk from one port to another when we are creating paths where there are multiple switches between two ports. For this reason introduce a new function tb_next_port_on_path(). Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by: Lukas Wunner <lukas@wunner.de>
2019-04-18thunderbolt: Assign remote for both ports in case of dual linkMika Westerberg5-52/+73
Currently the driver only assigns remote port for the primary port if in case of dual link. This makes things such as walking from one port to another more complex than necessary because the code needs to change from secondary to primary port if the path that is established is created using secondary links. In order to always assign both remote pointers we need to prevent the scanning code from following the secondary link. Failing to do that might cause problems as the same switch may be enumerated twice (or removed in case of unplug). Handle that properly by introducing a new function tb_port_has_remote() that returns true only for the primary port. We also update tb_is_upstream_port() to support both dual link ports, make it take const port pointer and move it below tb_upstream_port() to keep similar functions close. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2019-04-18thunderbolt: Add functions for allocating and releasing HopIDsMika Westerberg3-3/+98
Each port has a separate path configuration space that is used for finding the next hop (switch) in the path. HopID is an index to this configuration space. HopIDs 0 - 7 are reserved by the protocol. In order to get next available HopID for each direction we provide two pairs of helper functions that can be used to allocate and release HopIDs for a given port. While there remove obsolete TODO comment. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2019-04-18thunderbolt: Generalize tunnel creation functionalityMika Westerberg6-150/+235
To be able to tunnel non-PCIe traffic, separate tunnel functionality into generic and PCIe specific parts. Rename struct tb_pci_tunnel to tb_tunnel, and make it hold an array of paths instead of just two. Update all the tunneling functions to take this structure as parameter. We also move tb_pci_port_active() to switch.c (and rename it) where we will be keeping all port and switch related functions. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2019-04-18thunderbolt: Rename tunnel_pci to tunnelMika Westerberg4-7/+7
In order to tunnel non-PCIe traffic as well rename tunnel_pci.[ch] to tunnel.[ch] to reflect this fact. No functional changes. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2019-04-18thunderbolt: Cache adapter specific capability offset into struct portMika Westerberg4-10/+13
The adapter specific capability either is there or not if the port does not hold an adapter. Instead of always finding it on-demand we read the offset just once when the port is initialized. While there we update the struct port documentation to follow kernel-doc format. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2019-04-18thunderbolt: Properly disable pathMika Westerberg2-5/+45
We need to wait until all buffers have been drained before the path can be considered disabled. Do this for every hop in a path. This adds another bit field to struct tb_regs_hop even if we are trying to get rid of them but we can clean them up another day. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2019-04-18thunderbolt: Set sleep bit when suspending switchMika Westerberg4-4/+49
Thunderbolt 2 devices and beyond link controller needs to be notified when a switch is going to be suspended by setting bit 31 in LC_SX_CTRL register. Add this functionality to the software connection manager. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2019-04-18thunderbolt: Configure lanes when switch is initializedMika Westerberg4-0/+136
Thunderbolt 2 devices and beyond need to have additional bits set in link controller specific registers. This includes two bits in LC_SX_CTRL that tell the link controller which lane is connected and whether it is upstream facing or not. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2019-04-18thunderbolt: Move LC specific functionality into a separate fileMika Westerberg5-12/+37
We will be adding more link controller functionality in subsequent patches and it does not make sense to keep all that in switch.c, so separate LC functionality into its own file. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2019-04-18thunderbolt: Add dummy read after port capability list walk on Light RidgeMika Westerberg1-0/+16
Light Ridge has an issue where reading the next capability pointer location in port config space the read data is not cleared. It is fine to read capabilities each after another so only thing we need to do is to make sure we issue dummy read after tb_port_find_cap() is finished to avoid the issue in next read. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2019-04-18thunderbolt: Enable TMU access when accessing port space on legacy devicesMika Westerberg2-17/+62
Light Ridge and Eagle Ridge both need to have TMU access enabled before port space can be fully accessed so make sure it happens on those. This allows us to get rid of the offset quirk in tb_port_find_cap(). Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2019-04-18thunderbolt: Do not allocate switch if depth is greater than 6Mika Westerberg3-9/+15
Maximum depth in Thunderbolt topology is 6 so make sure it is not possible to allocate switches that exceed the depth limit. While at it update tb_switch_alloc() to use upper/lower_32_bits() following tb_switch_alloc_safe_mode(). Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2019-04-18thunderbolt: Take domain lock in switch sysfs attribute callbacksMika Westerberg2-28/+20
switch_lock was introduced because it allowed serialization of device authorization requests from userspace without need to take the big domain lock (tb->lock). This was fine because device authorization with ICM is just one command that is sent to the firmware. Now that we start to handle all tunneling in the driver switch_lock is not enough because we need to walk over the topology to establish paths. For this reason drop switch_lock from the driver completely in favour of big domain lock. There is one complication, though. If userspace is waiting for the lock in tb_switch_set_authorized(), it keeps the device_del() from removing the sysfs attribute because it waits for active users to release the attribute first which leads into following splat: INFO: task kworker/u8:3:73 blocked for more than 61 seconds. Tainted: G W 5.1.0-rc1+ #244 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. kworker/u8:3 D12976 73 2 0x80000000 Workqueue: thunderbolt0 tb_handle_hotplug [thunderbolt] Call Trace: ? __schedule+0x2e5/0x740 ? _raw_spin_lock_irqsave+0x12/0x40 ? prepare_to_wait_event+0xc5/0x160 schedule+0x2d/0x80 __kernfs_remove.part.17+0x183/0x1f0 ? finish_wait+0x80/0x80 kernfs_remove_by_name_ns+0x4a/0x90 remove_files.isra.1+0x2b/0x60 sysfs_remove_group+0x38/0x80 sysfs_remove_groups+0x24/0x40 device_remove_attrs+0x3d/0x70 device_del+0x14c/0x360 device_unregister+0x15/0x50 tb_switch_remove+0x9e/0x1d0 [thunderbolt] tb_handle_hotplug+0x119/0x5a0 [thunderbolt] ? process_one_work+0x1b7/0x420 process_one_work+0x1b7/0x420 worker_thread+0x37/0x380 ? _raw_spin_unlock_irqrestore+0xf/0x30 ? process_one_work+0x420/0x420 kthread+0x118/0x130 ? kthread_create_on_node+0x60/0x60 ret_from_fork+0x35/0x40 We deal this by following what network stack did for some of their attributes and use mutex_trylock() with restart_syscall(). This makes userspace release the attribute allowing sysfs attribute removal to progress before the write is restarted and eventually fail when the attribute is removed. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2019-04-18thunderbolt: Block reads and writes if switch is unpluggedMika Westerberg1-0/+8
If switch is already disconnected there is no point sending it commands and waiting for timeout. Instead in that case return error immediately. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2019-04-18thunderbolt: Drop duplicated get_switch_at_route()Mika Westerberg4-26/+14
tb_switch_find_by_route() does the same already so use it instead and remove duplicated get_switch_at_route(). Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by: Lukas Wunner <lukas@wunner.de>
2019-04-18thunderbolt: Remove unused work field in struct tb_switchMika Westerberg1-2/+0
This field is not used anywhere so remove it. Reported-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2019-04-18net: thunderbolt: Unregister ThunderboltIP protocol handler when suspendingMika Westerberg1-0/+3
The XDomain protocol messages may start as soon as Thunderbolt control channel is started. This means that if the other host starts sending ThunderboltIP packets early enough they will be passed to the network driver which then gets confused because its resume hook is not called yet. Fix this by unregistering the ThunderboltIP protocol handler when suspending and registering it back on resume. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Acked-by: David S. Miller <davem@davemloft.net>
2019-03-28thunderbolt: Fix to check the return value of kmemdupAditya Pakki1-0/+5
uuid in add_switch is allocted via kmemdup which can fail. The patch logs the error and cleans up the allocated memory for switch. Signed-off-by: Aditya Pakki <pakki001@umn.edu> Reviewed-by: Mukesh Ojha <mojha@codeaurora.org> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2019-03-28thunderbolt: property: Fix a missing check of kzallocKangjie Lu1-1/+6
No check is enforced for the return value of kzalloc, which may lead to NULL-pointer dereference. The patch fixes this issue. Signed-off-by: Kangjie Lu <kjlu@umn.edu> Reviewed-by: Mukesh Ojha <mojha@codeaurora.org> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2019-03-22thunderbolt: xdomain: Fix to check return value of kmemdupAditya Pakki1-6/+9
kmemdup can fail and return a NULL pointer. The patch modifies the signature of tb_xdp_schedule_request and passes the failure error upstream. Signed-off-by: Aditya Pakki <pakki001@umn.edu> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2019-03-22thunderbolt: Fix to check return value of ida_simple_getAditya Pakki1-1/+7
In enumerate_services, ida_simple_get on failure can return an error and leaks memory. The patch ensures that the dev_set_name is set on non failure cases, and releases memory during failure. Signed-off-by: Aditya Pakki <pakki001@umn.edu> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2019-03-22thunderbolt: Fix to check for kmemdup failureAditya Pakki1-6/+16
Memory allocated via kmemdup might fail and return a NULL pointer. This patch adds a check on the return value of kmemdup and passes the error upstream. Signed-off-by: Aditya Pakki <pakki001@umn.edu> Reviewed-by: Mukesh Ojha <mojha@codeaurora.org> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2019-03-20thunderbolt: Fix a missing check of kmemdupKangjie Lu1-0/+4
kmemdup may fail and return NULL. The fix adds a check and returns NULL in case it fails to avoid NULL pointer dereferecen. Signed-off-by: Kangjie Lu <kjlu@umn.edu> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2019-03-20thunderbolt: property: Fix a NULL pointer dereferenceKangjie Lu1-0/+5
In case kzalloc fails, the fix releases resources and returns -ENOMEM to avoid the NULL pointer dereference. Signed-off-by: Kangjie Lu <kjlu@umn.edu> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2019-03-17Linux 5.1-rc1v5.1-rc1Linus Torvalds1-2/+2
2019-03-17Merge tag 'kbuild-v5.1-2' of ↵Linus Torvalds57-156/+153
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull more Kbuild updates from Masahiro Yamada: - add more Build-Depends to Debian source package - prefix header search paths with $(srctree)/ - make modpost show verbose section mismatch warnings - avoid hard-coded CROSS_COMPILE for h8300 - fix regression for Debian make-kpkg command - add semantic patch to detect missing put_device() - fix some warnings of 'make deb-pkg' - optimize NOSTDINC_FLAGS evaluation - add warnings about redundant generic-y - clean up Makefiles and scripts * tag 'kbuild-v5.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kconfig: remove stale lxdialog/.gitignore kbuild: force all architectures except um to include mandatory-y kbuild: warn redundant generic-y Revert "modsign: Abort modules_install when signing fails" kbuild: Make NOSTDINC_FLAGS a simply expanded variable kbuild: deb-pkg: avoid implicit effects coccinelle: semantic code search for missing put_device() kbuild: pkg: grep include/config/auto.conf instead of $KCONFIG_CONFIG kbuild: deb-pkg: introduce is_enabled and if_enabled_echo to builddeb kbuild: deb-pkg: add CONFIG_ prefix to kernel config options kbuild: add workaround for Debian make-kpkg kbuild: source include/config/auto.conf instead of ${KCONFIG_CONFIG} unicore32: simplify linker script generation for decompressor h8300: use cc-cross-prefix instead of hardcoding h8300-unknown-linux- kbuild: move archive command to scripts/Makefile.lib modpost: always show verbose warning for section mismatch ia64: prefix header search path with $(srctree)/ libfdt: prefix header search paths with $(srctree)/ deb-pkg: generate correct build dependencies
2019-03-17Merge branch 'x86-asm-for-linus' of ↵Linus Torvalds3-125/+0
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 asm updates from Thomas Gleixner: "Two cleanup patches removing dead conditionals and unused code" * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/asm: Remove unused __constant_c_x_memset() macro and inlines x86/asm: Remove dead __GNUC__ conditionals
2019-03-17Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds2-4/+4
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Thomas Gleixner: "Three fixes for the fallout from the TSX errata workaround: - Prevent memory corruption caused by a unchecked out of bound array index. - Two trivial fixes to address compiler warnings" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel: Make dev_attr_allow_tsx_force_abort static perf/x86: Fixup typo in stub functions perf/x86/intel: Fix memory corruption
2019-03-17Merge tag 'for-linus-5.1b-rc1b-tag' of ↵Linus Torvalds1-1/+4
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fix from Juergen Gross: "A fix for a Xen bug introduced by David's series for excluding ballooned pages in vmcores" * tag 'for-linus-5.1b-rc1b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/balloon: Fix mapping PG_offline pages to user space
2019-03-17Merge tag '9p-for-5.1' of git://github.com/martinetd/linuxLinus Torvalds7-32/+55
Pull 9p updates from Dominique Martinet: "Here is a 9p update for 5.1; there honestly hasn't been much. Two fixes (leak on invalid mount argument and possible deadlock on i_size update on 32bit smp) and a fall-through warning cleanup" * tag '9p-for-5.1' of git://github.com/martinetd/linux: 9p/net: fix memory leak in p9_client_create 9p: use inode->i_lock to protect i_size_write() under 32-bit 9p: mark expected switch fall-through
2019-03-17perf/x86/intel: Make dev_attr_allow_tsx_force_abort statickbuild test robot1-1/+1
Fixes: 400816f60c54 ("perf/x86/intel: Implement support for TSX Force Abort") Signed-off-by: kbuild test robot <lkp@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org> Cc: kbuild-all@01.org Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20190313184243.GA10820@lkp-sb-ep06
2019-03-17kconfig: remove stale lxdialog/.gitignoreMasahiro Yamada1-4/+0
When this .gitignore was added, lxdialog was an independent hostprogs-y. Now that all objects in lxdialog/ are directly linked to mconf, the lxdialog is no longer generated. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-03-17kbuild: force all architectures except um to include mandatory-yMasahiro Yamada29-47/+18
Currently, every arch/*/include/uapi/asm/Kbuild explicitly includes the common Kbuild.asm file. Factor out the duplicated include directives to scripts/Makefile.asm-generic so that no architecture would opt out of the mandatory-y mechanism. um is not forced to include mandatory-y since it is a very exceptional case which does not support UAPI. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-03-17kbuild: warn redundant generic-yMasahiro Yamada12-13/+6
The generic-y is redundant under the following condition: - arch has its own implementation - the same header is added to generated-y - the same header is added to mandatory-y If a redundant generic-y is found, the warning like follows is displayed: scripts/Makefile.asm-generic:20: redundant generic-y found in arch/arm/include/asm/Kbuild: timex.h I fixed up arch Kbuild files found by this. Suggested-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-03-17Revert "modsign: Abort modules_install when signing fails"Douglas Anderson1-1/+1
This reverts commit caf6fe91ddf62a96401e21e9b7a07227440f4185. The commit was fine but is no longer needed as of commit 3a2429e1faf4 ("kbuild: change if_changed_rule for multi-line recipe"). Let's go back to using ";" to be consistent. For some discussion, see: https://lkml.kernel.org/r/CAK7LNASde0Q9S5GKeQiWhArfER4S4wL1=R_FW8q0++_X3T5=hQ@mail.gmail.com Signed-off-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-03-17kbuild: Make NOSTDINC_FLAGS a simply expanded variableDouglas Anderson1-1/+1
During a simple no-op (nothing changed) build I saw 39 invocations of the C compiler with the argument "-print-file-name=include". We don't need to call the C compiler 39 times for this--one time will suffice. Let's change NOSTDINC_FLAGS to a simply expanded variable to avoid this since there doesn't appear to be any reason it should be recursively expanded. On my build this shaved ~400 ms off my "no-op" build. Note that the recursive expansion seems to date back to the (really old) commit e8f5bdb02ce0 ("[PATCH] Makefile include path ordering"). It's a little unclear to me if the point of that patch was to switch the variable to be recursively expanded (which it did) or to avoid directly assigning to NOSTDINC_FLAGS (AKA to switch to +=) because someone else (out of tree?) was setting it. I presume later since if the only goal was to switch to recursive expansion the patch would have just removed the ":". Signed-off-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-03-17kbuild: deb-pkg: avoid implicit effectsArseny Maslennikov1-1/+4
* The man page for dpkg-source(1) notes: > -b, --build directory [format-specific-parameters] > Build a source package (--build since dpkg 1.17.14). > <...> > > dpkg-source will build the source package with the first > format found in this ordered list: the format indicated > with the --format command line option, the format > indicated in debian/source/format, “1.0”. The fallback > to “1.0” is deprecated and will be removed at some point > in the future, you should always document the desired > source format in debian/source/format. See section > SOURCE PACKAGE FORMATS for an extensive description of > the various source package formats. Thus it would be more foolproof to explicitly use 1.0 (as we always did) than to rely on dpkg-source's defaults. * In a similar vein, debian/rules is not made executable by mkdebian, and dpkg-source warns about that but still silently fixes the file. Let's be explicit once again. Signed-off-by: Arseny Maslennikov <ar@cs.msu.ru> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-03-17coccinelle: semantic code search for missing put_device()Wen Yang1-0/+56
The of_find_device_by_node() takes a reference to the underlying device structure, we should release that reference. The implementation of this semantic code search is: In a function, for a local variable returned by calling of_find_device_by_node(), a, if it is released by a function such as put_device()/of_dev_put()/platform_device_put() after the last use, it is considered that there is no reference leak; b, if it is passed back to the caller via dev_get_drvdata()/platform_get_drvdata()/get_device(), etc., the reference will be released in other functions, and the current function also considers that there is no reference leak; c, for the rest of the situation, the current function should release the reference by calling put_device, this code search will report the corresponding error message. By using this semantic code search, we have found some object reference leaks, such as: commit 11907e9d3533 ("ASoC: fsl-asoc-card: fix object reference leaks in fsl_asoc_card_probe") commit a12085d13997 ("mtd: rawnand: atmel: fix possible object reference leak") commit 11493f26856a ("mtd: rawnand: jz4780: fix possible object reference leak") There are still dozens of reference leaks in the current kernel code. Further, for the case of b, the object returned to other functions may also have a reference leak, we will continue to develop other cocci scripts to further check the reference leak. Signed-off-by: Wen Yang <wen.yang99@zte.com.cn> Reviewed-by: Julia Lawall <Julia.Lawall@lip6.fr> Reviewed-by: Markus Elfring <Markus.Elfring@web.de> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-03-16Merge tag 'pidfd-v5.1-rc1' of ↵Linus Torvalds11-6/+538
git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull pidfd system call from Christian Brauner: "This introduces the ability to use file descriptors from /proc/<pid>/ as stable handles on struct pid. Even if a pid is recycled the handle will not change. For a start these fds can be used to send signals to the processes they refer to. With the ability to use /proc/<pid> fds as stable handles on struct pid we can fix a long-standing issue where after a process has exited its pid can be reused by another process. If a caller sends a signal to a reused pid it will end up signaling the wrong process. With this patchset we enable a variety of use cases. One obvious example is that we can now safely delegate an important part of process management - sending signals - to processes other than the parent of a given process by sending file descriptors around via scm rights and not fearing that the given process will have been recycled in the meantime. It also allows for easy testing whether a given process is still alive or not by sending signal 0 to a pidfd which is quite handy. There has been some interest in this feature e.g. from systems management (systemd, glibc) and container managers. I have requested and gotten comments from glibc to make sure that this syscall is suitable for their needs as well. In the future I expect it to take on most other pid-based signal syscalls. But such features are left for the future once they are needed. This has been sitting in linux-next for quite a while and has not caused any issues. It comes with selftests which verify basic functionality and also test that a recycled pid cannot be signaled via a pidfd. Jon has written about a prior version of this patchset. It should cover the basic functionality since not a lot has changed since then: https://lwn.net/Articles/773459/ The commit message for the syscall itself is extensively documenting the syscall, including it's functionality and extensibility" * tag 'pidfd-v5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: selftests: add tests for pidfd_send_signal() signal: add pidfd_send_signal() syscall
2019-03-16Merge tag 'devdax-for-5.1' of ↵Linus Torvalds30-537/+1111
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull device-dax updates from Dan Williams: "New device-dax infrastructure to allow persistent memory and other "reserved" / performance differentiated memories, to be assigned to the core-mm as "System RAM". Some users want to use persistent memory as additional volatile memory. They are willing to cope with potential performance differences, for example between DRAM and 3D Xpoint, and want to use typical Linux memory management apis rather than a userspace memory allocator layered over an mmap() of a dax file. The administration model is to decide how much Persistent Memory (pmem) to use as System RAM, create a device-dax-mode namespace of that size, and then assign it to the core-mm. The rationale for device-dax is that it is a generic memory-mapping driver that can be layered over any "special purpose" memory, not just pmem. On subsequent boots udev rules can be used to restore the memory assignment. One implication of using pmem as RAM is that mlock() no longer keeps data off persistent media. For this reason it is recommended to enable NVDIMM Security (previously merged for 5.0) to encrypt pmem contents at rest. We considered making this recommendation an actively enforced requirement, but in the end decided to leave it as a distribution / administrator policy to allow for emulation and test environments that lack security capable NVDIMMs. Summary: - Replace the /sys/class/dax device model with /sys/bus/dax, and include a compat driver so distributions can opt-in to the new ABI. - Allow for an alternative driver for the device-dax address-range - Introduce the 'kmem' driver to hotplug / assign a device-dax address-range to the core-mm. - Arrange for the device-dax target-node to be onlined so that the newly added memory range can be uniquely referenced by numa apis" NOTE! I'm not entirely happy with the whole "PMEM as RAM" model because we currently have special - and very annoying rules in the kernel about accessing PMEM only with the "MC safe" accessors, because machine checks inside the regular repeat string copy functions can be fatal in some (not described) circumstances. And apparently the PMEM modules can cause that a lot more than regular RAM. The argument is that this happens because PMEM doesn't necessarily get scrubbed at boot like RAM does, but that is planned to be added for the user space tooling. Quoting Dan from another email: "The exposure can be reduced in the volatile-RAM case by scanning for and clearing errors before it is onlined as RAM. The userspace tooling for that can be in place before v5.1-final. There's also runtime notifications of errors via acpi_nfit_uc_error_notify() from background scrubbers on the DIMM devices. With that mechanism the kernel could proactively clear newly discovered poison in the volatile case, but that would be additional development more suitable for v5.2. I understand the concern, and the need to highlight this issue by tapping the brakes on feature development, but I don't see PMEM as RAM making the situation worse when the exposure is also there via DAX in the PMEM case. Volatile-RAM is arguably a safer use case since it's possible to repair pages where the persistent case needs active application coordination" * tag 'devdax-for-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: device-dax: "Hotplug" persistent memory for use like normal RAM mm/resource: Let walk_system_ram_range() search child resources mm/memory-hotplug: Allow memory resources to be children mm/resource: Move HMM pr_debug() deeper into resource code mm/resource: Return real error codes from walk failures device-dax: Add a 'modalias' attribute to DAX 'bus' devices device-dax: Add a 'target_node' attribute device-dax: Auto-bind device after successful new_id acpi/nfit, device-dax: Identify differentiated memory with a unique numa-node device-dax: Add /sys/class/dax backwards compatibility device-dax: Add support for a dax override driver device-dax: Move resource pinning+mapping into the common driver device-dax: Introduce bus + driver model device-dax: Start defining a dax bus model device-dax: Remove multi-resource infrastructure device-dax: Kill dax_region base device-dax: Kill dax_region ida
2019-03-16Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds22-80/+210
Pull more SCSI updates from James Bottomley: "This is the final round of mostly small fixes and performance improvements to our initial submit. The main regression fix is the ia64 simscsi build failure which was missed in the serial number elimination conversion" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (24 commits) scsi: ia64: simscsi: use request tag instead of serial_number scsi: aacraid: Fix performance issue on logical drives scsi: lpfc: Fix error codes in lpfc_sli4_pci_mem_setup() scsi: libiscsi: Hold back_lock when calling iscsi_complete_task scsi: hisi_sas: Change SERDES_CFG init value to increase reliability of HiLink scsi: hisi_sas: Send HARD RESET to clear the previous affiliation of STP target port scsi: hisi_sas: Set PHY linkrate when disconnected scsi: hisi_sas: print PHY RX errors count for later revision of v3 hw scsi: hisi_sas: Fix a timeout race of driver internal and SMP IO scsi: hisi_sas: Change return variable type in phy_up_v3_hw() scsi: qla2xxx: check for kstrtol() failure scsi: lpfc: fix 32-bit format string warning scsi: lpfc: fix unused variable warning scsi: target: tcmu: Switch to bitmap_zalloc() scsi: libiscsi: fall back to sendmsg for slab pages scsi: qla2xxx: avoid printf format warning scsi: lpfc: resolve static checker warning in lpfc_sli4_hba_unset scsi: lpfc: Correct __lpfc_sli_issue_iocb_s4 lockdep check scsi: ufs: hisi: fix ufs_hba_variant_ops passing scsi: qla2xxx: Fix panic in qla_dfs_tgt_counters_show ...
2019-03-16Merge tag 'for-5.1/block-post-20190315' of git://git.kernel.dk/linux-blockLinus Torvalds20-128/+259
Pull more block layer changes from Jens Axboe: "This is a collection of both stragglers, and fixes that came in after I finalized the initial pull. This contains: - An MD pull request from Song, with a few minor fixes - Set of NVMe patches via Christoph - Pull request from Konrad, with a few fixes for xen/blkback - pblk fix IO calculation fix (Javier) - Segment calculation fix for pass-through (Ming) - Fallthrough annotation for blkcg (Mathieu)" * tag 'for-5.1/block-post-20190315' of git://git.kernel.dk/linux-block: (25 commits) blkcg: annotate implicit fall through nvme-tcp: support C2HData with SUCCESS flag nvmet: ignore EOPNOTSUPP for discard nvme: add proper write zeroes setup for the multipath device nvme: add proper discard setup for the multipath device nvme: remove nvme_ns_config_oncs nvme: disable Write Zeroes for qemu controllers nvmet-fc: bring Disconnect into compliance with FC-NVME spec nvmet-fc: fix issues with targetport assoc_list list walking nvme-fc: reject reconnect if io queue count is reduced to zero nvme-fc: fix numa_node when dev is null nvme-fc: use nr_phys_segments to determine existence of sgl nvme-loop: init nvmet_ctrl fatal_err_work when allocate nvme: update comment to make the code easier to read nvme: put ns_head ref if namespace fails allocation nvme-trace: fix cdw10 buffer overrun nvme: don't warn on block content change effects nvme: add get-feature to admin cmds tracer md: Fix failed allocation of md_register_thread It's wrong to add len to sector_nr in raid10 reshape twice ...
2019-03-16Merge tag 'nfs-for-5.1-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds5-23/+21
Pull NFS client bugfixes from Trond Myklebust: "Highlights include: Bugfixes: - Fix an Oops in SUNRPC back channel tracepoints - Fix a SUNRPC client regression when handling oversized replies - Fix the minimal size for SUNRPC reply buffer allocation - rpc_decode_header() must always return a non-zero value on error - Fix a typo in pnfs_update_layout() Cleanup: - Remove redundant check for the reply length in call_decode()" * tag 'nfs-for-5.1-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: SUNRPC: Remove redundant check for the reply length in call_decode() SUNRPC: Handle the SYSTEM_ERR rpc error SUNRPC: rpc_decode_header() must always return a non-zero value on error SUNRPC: Use the ENOTCONN error on socket disconnect SUNRPC: Fix the minimal size for reply buffer allocation SUNRPC: Fix a client regression when handling oversized replies pNFS: Fix a typo in pnfs_update_layout fix null pointer deref in tracepoints in back channel
2019-03-16Merge tag 'powerpc-5.1-2' of ↵Linus Torvalds6-10/+22
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "One fix to prevent runtime allocation of 16GB pages when running in a VM (as opposed to bare metal), because it doesn't work. A small fix to our recently added KCOV support to exempt some more code from being instrumented. Plus a few minor build fixes, a small dead code removal and a defconfig update. Thanks to: Alexey Kardashevskiy, Aneesh Kumar K.V, Christophe Leroy, Jason Yan, Joel Stanley, Mahesh Salgaonkar, Mathieu Malaterre" * tag 'powerpc-5.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/64s: Include <asm/nmi.h> header file to fix a warning powerpc/powernv: Fix compile without CONFIG_TRACEPOINTS powerpc/mm: Disable kcov for SLB routines powerpc: remove dead code in head_fsl_booke.S powerpc/configs: Sync skiroot defconfig powerpc/hugetlb: Don't do runtime allocation of 16G pages in LPAR configuration
2019-03-16Merge branch 'work.mount' of ↵Linus Torvalds1-3/+5
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs mount infrastructure fix from Al Viro: "Fixup for sysfs braino. Capabilities checks for sysfs mount do include those on netns, but only if CONFIG_NET_NS is enabled. Sorry, should've caught that earlier..." * 'work.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fix sysfs_init_fs_context() in !CONFIG_NET_NS case
2019-03-16fix sysfs_init_fs_context() in !CONFIG_NET_NS caseAl Viro1-3/+5
Permission checks on current's netns should be done only when netns are enabled. Reported-by: Dominik Brodowski <linux@dominikbrodowski.net> Fixes: 23bf1b6be9c2 Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-03-15Merge tag '5.1-rc-smb3' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds17-362/+981
Pull more smb3 updates from Steve French: "Various tracing and debugging improvements, crediting fixes, some cleanup, and important fallocate fix (fixes three xfstests) and lock fix. Summary: - Various additional dynamic tracing tracepoints - Debugging improvements (including ability to query the server via SMB3 fsctl from userspace tools which can help with stats and debugging) - One minor performance improvement (root directory inode caching) - Crediting (SMB3 flow control) fixes - Some cleanup (docs and to mknod) - Important fixes: one to smb3 implementation of fallocate zero range (which fixes three xfstests) and a POSIX lock fix" * tag '5.1-rc-smb3' of git://git.samba.org/sfrench/cifs-2.6: (22 commits) CIFS: fix POSIX lock leak and invalid ptr deref SMB3: Allow SMB3 FSCTL queries to be sent to server from tools cifs: fix incorrect handling of smb2_set_sparse() return in smb3_simple_falloc smb2: fix typo in definition of a few error flags CIFS: make mknod() an smb_version_op cifs: minor documentation updates cifs: remove unused value pointed out by Coverity SMB3: passthru query info doesn't check for SMB3 FSCTL passthru smb3: add dynamic tracepoints for simple fallocate and zero range cifs: fix smb3_zero_range so it can expand the file-size when required cifs: add SMB2_ioctl_init/free helpers to be used with compounding smb3: Add dynamic trace points for various compounded smb3 ops cifs: cache FILE_ALL_INFO for the shared root handle smb3: display volume serial number for shares in /proc/fs/cifs/DebugData cifs: simplify how we handle credits in compound_send_recv() smb3: add dynamic tracepoint for timeout waiting for credits smb3: display security information in /proc/fs/cifs/DebugData more accurately cifs: add a timeout argument to wait_for_free_credits cifs: prevent starvation in wait_for_free_credits for multi-credit requests cifs: wait_for_free_credits() make it possible to wait for >=1 credits ...
2019-03-15Merge branch 'for-linus-5.1-rc1' of ↵Linus Torvalds2-6/+3
git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml Pull UML updates from Richard Weinberger: "Bugfix for the UML block device driver" * 'for-linus-5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml: um: Fix for a possible OOPS in ubd initialization um: Remove duplicated include from vector_user.c
2019-03-15Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds93-1199/+2623
Pull KVM updates from Paolo Bonzini: "ARM: - some cleanups - direct physical timer assignment - cache sanitization for 32-bit guests s390: - interrupt cleanup - introduction of the Guest Information Block - preparation for processor subfunctions in cpu models PPC: - bug fixes and improvements, especially related to machine checks and protection keys x86: - many, many cleanups, including removing a bunch of MMU code for unnecessary optimizations - AVIC fixes Generic: - memcg accounting" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (147 commits) kvm: vmx: fix formatting of a comment KVM: doc: Document the life cycle of a VM and its resources MAINTAINERS: Add KVM selftests to existing KVM entry Revert "KVM/MMU: Flush tlb directly in the kvm_zap_gfn_range()" KVM: PPC: Book3S: Add count cache flush parameters to kvmppc_get_cpu_char() KVM: PPC: Fix compilation when KVM is not enabled KVM: Minor cleanups for kvm_main.c KVM: s390: add debug logging for cpu model subfunctions KVM: s390: implement subfunction processor calls arm64: KVM: Fix architecturally invalid reset value for FPEXC32_EL2 KVM: arm/arm64: Remove unused timer variable KVM: PPC: Book3S: Improve KVM reference counting KVM: PPC: Book3S HV: Fix build failure without IOMMU support Revert "KVM: Eliminate extra function calls in kvm_get_dirty_log_protect()" x86: kvmguest: use TSC clocksource if invariant TSC is exposed KVM: Never start grow vCPU halt_poll_ns from value below halt_poll_ns_grow_start KVM: Expose the initial start value in grow_halt_poll_ns() as a module parameter KVM: grow_halt_poll_ns() should never shrink vCPU halt_poll_ns KVM: x86/mmu: Consolidate kvm_mmu_zap_all() and kvm_mmu_zap_mmio_sptes() KVM: x86/mmu: WARN if zapping a MMIO spte results in zapping children ...