summaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2019-03-15Merge branch 'akpm' (rest of patches from Andrew)Linus Torvalds2-72/+131
Merge the left-over patches from Andrew Morton. This merges the remaining two patches from Andrew's pile of "little bit more MM". I mulled it over, and we emailed back and forth with Josef, and he pointed out where I was wrong. Rule #51 of kernel maintenance: when somebody makes it clear that they know the code better than you did, stop arguing and just apply the damn patch. Add a third patch by me to add a comment for the case that I had thought was buggy and Josef corrected me on. * emailed patches from Andrew Morton <akpm@linux-foundation.org>: filemap: add a comment about FAULT_FLAG_RETRY_NOWAIT behavior filemap: drop the mmap_sem for all blocking operations filemap: kill page_cache_read usage in filemap_fault
2019-03-15filemap: add a comment about FAULT_FLAG_RETRY_NOWAIT behaviorLinus Torvalds1-0/+5
I thought Josef Bacik's patch to drop the mmap_sem was buggy, because when looking at the error cases, there was one case where we returned VM_FAULT_RETRY without actually dropping the mmap_sem. Josef had to explain to me (using small words) that yes, that's actually what we're supposed to do, and his patch was correct. Which not only convinced me he knew what he was doing and I should stop arguing with him, but also that I should add a comment to the case I was confused about. Patiently-pointed-out-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-15filemap: drop the mmap_sem for all blocking operationsJosef Bacik1-19/+117
Currently we only drop the mmap_sem if there is contention on the page lock. The idea is that we issue readahead and then go to lock the page while it is under IO and we want to not hold the mmap_sem during the IO. The problem with this is the assumption that the readahead does anything. In the case that the box is under extreme memory or IO pressure we may end up not reading anything at all for readahead, which means we will end up reading in the page under the mmap_sem. Even if the readahead does something, it could get throttled because of io pressure on the system and the process is in a lower priority cgroup. Holding the mmap_sem while doing IO is problematic because it can cause system-wide priority inversions. Consider some large company that does a lot of web traffic. This large company has load balancing logic in it's core web server, cause some engineer thought this was a brilliant plan. This load balancing logic gets statistics from /proc about the system, which trip over processes mmap_sem for various reasons. Now the web server application is in a protected cgroup, but these other processes may not be, and if they are being throttled while their mmap_sem is held we'll stall, and cause this nice death spiral. Instead rework filemap fault path to drop the mmap sem at any point that we may do IO or block for an extended period of time. This includes while issuing readahead, locking the page, or needing to call ->readpage because readahead did not occur. Then once we have a fully uptodate page we can return with VM_FAULT_RETRY and come back again to find our nicely in-cache page that was gotten outside of the mmap_sem. This patch also adds a new helper for locking the page with the mmap_sem dropped. This doesn't make sense currently as generally speaking if the page is already locked it'll have been read in (unless there was an error) before it was unlocked. However a forthcoming patchset will change this with the ability to abort read-ahead bio's if necessary, making it more likely that we could contend for a page lock and still have a not uptodate page. This allows us to deal with this case by grabbing the lock and issuing the IO without the mmap_sem held, and then returning VM_FAULT_RETRY to come back around. [josef@toxicpanda.com: v6] Link: http://lkml.kernel.org/r/20181212152757.10017-1-josef@toxicpanda.com [kirill@shutemov.name: fix race in filemap_fault()] Link: http://lkml.kernel.org/r/20181228235106.okk3oastsnpxusxs@kshutemo-mobl1 [akpm@linux-foundation.org: coding style fixes] Link: http://lkml.kernel.org/r/20181211173801.29535-4-josef@toxicpanda.com Signed-off-by: Josef Bacik <josef@toxicpanda.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Jan Kara <jack@suse.cz> Tested-by: syzbot+b437b5a429d680cf2217@syzkaller.appspotmail.com Cc: Dave Chinner <david@fromorbit.com> Cc: Rik van Riel <riel@redhat.com> Cc: Tejun Heo <tj@kernel.org> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-15filemap: kill page_cache_read usage in filemap_faultJosef Bacik2-60/+16
Patch series "drop the mmap_sem when doing IO in the fault path", v6. Now that we have proper isolation in place with cgroups2 we have started going through and fixing the various priority inversions. Most are all gone now, but this one is sort of weird since it's not necessarily a priority inversion that happens within the kernel, but rather because of something userspace does. We have giant applications that we want to protect, and parts of these giant applications do things like watch the system state to determine how healthy the box is for load balancing and such. This involves running 'ps' or other such utilities. These utilities will often walk /proc/<pid>/whatever, and these files can sometimes need to down_read(&task->mmap_sem). Not usually a big deal, but we noticed when we are stress testing that sometimes our protected application has latency spikes trying to get the mmap_sem for tasks that are in lower priority cgroups. This is because any down_write() on a semaphore essentially turns it into a mutex, so even if we currently have it held for reading, any new readers will not be allowed on to keep from starving the writer. This is fine, except a lower priority task could be stuck doing IO because it has been throttled to the point that its IO is taking much longer than normal. But because a higher priority group depends on this completing it is now stuck behind lower priority work. In order to avoid this particular priority inversion we want to use the existing retry mechanism to stop from holding the mmap_sem at all if we are going to do IO. This already exists in the read case sort of, but needed to be extended for more than just grabbing the page lock. With io.latency we throttle at submit_bio() time, so the readahead stuff can block and even page_cache_read can block, so all these paths need to have the mmap_sem dropped. The other big thing is ->page_mkwrite. btrfs is particularly shitty here because we have to reserve space for the dirty page, which can be a very expensive operation. We use the same retry method as the read path, and simply cache the page and verify the page is still setup properly the next pass through ->page_mkwrite(). I've tested these patches with xfstests and there are no regressions. This patch (of 3): If we do not have a page at filemap_fault time we'll do this weird forced page_cache_read thing to populate the page, and then drop it again and loop around and find it. This makes for 2 ways we can read a page in filemap_fault, and it's not really needed. Instead add a FGP_FOR_MMAP flag so that pagecache_get_page() will return a unlocked page that's in pagecache. Then use the normal page locking and readpage logic already in filemap_fault. This simplifies the no page in page cache case significantly. [akpm@linux-foundation.org: fix comment text] [josef@toxicpanda.com: don't unlock null page in FGP_FOR_MMAP case] Link: http://lkml.kernel.org/r/20190312201742.22935-1-josef@toxicpanda.com Link: http://lkml.kernel.org/r/20181211173801.29535-2-josef@toxicpanda.com Signed-off-by: Josef Bacik <josef@toxicpanda.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Tejun Heo <tj@kernel.org> Cc: Dave Chinner <david@fromorbit.com> Cc: Rik van Riel <riel@redhat.com> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-14Merge branch 'akpm' (patches from Andrew)Linus Torvalds4-20/+63
Merge misc patches from Andrew Morton: - a little bit more MM - a few fixups [ The "little bit more MM" is actually just one of the three patches Andrew sent for mm/filemap.c, I'm still mulling over two more of them from Josef Bacik - Linus ] * emailed patches from Andrew Morton <akpm@linux-foundation.org>: include/linux/swap.h: use offsetof() instead of custom __swapoffset macro tools/testing/selftests/proc/proc-pid-vm.c: test with vsyscall in mind zram: default to lzo-rle instead of lzo filemap: pass vm_fault to the mmap ra helpers
2019-03-14include/linux/swap.h: use offsetof() instead of custom __swapoffset macroPi-Hsun Shih1-2/+2
Use offsetof() to calculate offset of a field to take advantage of compiler built-in version when possible, and avoid UBSAN warning when compiling with Clang: UBSAN: Undefined behaviour in mm/swapfile.c:3010:38 member access within null pointer of type 'union swap_header' CPU: 6 PID: 1833 Comm: swapon Tainted: G S 4.19.23 #43 Call trace: dump_backtrace+0x0/0x194 show_stack+0x20/0x2c __dump_stack+0x20/0x28 dump_stack+0x70/0x94 ubsan_epilogue+0x14/0x44 ubsan_type_mismatch_common+0xf4/0xfc __ubsan_handle_type_mismatch_v1+0x34/0x54 __se_sys_swapon+0x654/0x1084 __arm64_sys_swapon+0x1c/0x24 el0_svc_common+0xa8/0x150 el0_svc_compat_handler+0x2c/0x38 el0_svc_compat+0x8/0x18 Link: http://lkml.kernel.org/r/20190312081902.223764-1-pihsun@chromium.org Signed-off-by: Pi-Hsun Shih <pihsun@chromium.org> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-14tools/testing/selftests/proc/proc-pid-vm.c: test with vsyscall in mindAlexey Dobriyan1-3/+46
: selftests: proc: proc-pid-vm : ======================================== : proc-pid-vm: proc-pid-vm.c:277: main: Assertion `rv == strlen(buf0)' failed. : Aborted Because the vsyscall mapping is enabled. Read from vsyscall page to tell if vsyscall is being used. Link: http://lkml.kernel.org/r/20190307183204.GA11405@avx2 Link: http://lkml.kernel.org/r/20190219094722.GB28258@shao2-debian Fixes: 34aab6bec23e7e9 ("proc: test /proc/*/maps, smaps, smaps_rollup, statm") Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Reported-by: kernel test robot <rong.a.chen@intel.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-14zram: default to lzo-rle instead of lzoDave Rodgman1-1/+1
lzo-rle gives higher performance and similar compression ratios to lzo. Link: http://lkml.kernel.org/r/20190205155944.16007-4-dave.rodgman@arm.com Signed-off-by: Dave Rodgman <dave.rodgman@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-14filemap: pass vm_fault to the mmap ra helpersJosef Bacik1-14/+14
All of the arguments to these functions come from the vmf. Cut down on the amount of arguments passed by simply passing in the vmf to these two helpers. Link: http://lkml.kernel.org/r/20181211173801.29535-3-josef@toxicpanda.com Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Dave Chinner <david@fromorbit.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Rik van Riel <riel@redhat.com> Cc: Tejun Heo <tj@kernel.org> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-14Merge tag 'acpi-5.1-rc1-2' of ↵Linus Torvalds5-24/+35
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull more ACPI updates from Rafael Wysocki: "These fix a couple of issues and do some cleanups on top of the previous ACPI changes for 5.1-rc1. Specifics: - Fix a crash caused by unloading an SSDT overlay (Andy Shevchenko) - Prevent user space from getting confusing error values on failing ACPI sysfs accesses (Rafael Wysocki) - Simplify leaf node detection in the PPTT parsing code by using a new flag defined in ACPI 6.3 (Jeremy Linton) - Add missing "static" in some places in the ACPI configfs code (Andy Shevchenko) - Fix acpidbg tool path in the ACPI documentation (Flavio Suligoi)" * tag 'acpi-5.1-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: sysfs: Prevent get_status() from returning acpi_status ACPI / device_sysfs: Avoid OF modalias creation for removed device ACPI / configfs: Mark local data structures static ACPI / configfs: Mark local functions static ACPI: tables: Simplify PPTT leaf node detection ACPI: Documentation: Fix path for acpidbg tool
2019-03-14Merge tag 'pm-5.1-rc1-2' of ↵Linus Torvalds21-112/+131
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull more power management updates from Rafael Wysocki: "These are mostly fixes and cleanups on top of the previously merged power management material for 5.1-rc1 with one cpupower utility update that wasn't pushed earlier due to unfortunate timing. Specifics: - Fix registration of new cpuidle governors partially broken during the 5.0 development cycle by mistake (Rafael Wysocki). - Avoid integer overflows in the menu cpuidle governor by making it discard the overflowing data points upfront (Rafael Wysocki). - Fix minor mistake in the recent update of the iowait boost computation in the intel_pstate driver (Rafael Wysocki). - Drop incorrect __init annotation from one function in the pxa2xx cpufreq driver (Arnd Bergmann). - Fix the operating performance points (OPP) framework initialization for devices in multiple power domains if only one of them is scalable (Rajendra Nayak). - Fix mistake in dev_pm_opp_set_rate() which causes it to skip updating the performance state if the new frequency is the same as the old one (Viresh Kumar). - Rework the cancellation of wakeup source timers to avoid potential issues with it and do some cleanups unlocked by that change (Viresh Kumar, Rafael Wysocki). - Clean up the code computing the active/suspended time of devices in the PM-runtime framework after recent changes (Ulf Hansson). - Make the power management infrastructure code use pr_fmt() consistently (Joe Perches). - Clean up the generic power domains (genpd) framework somewhat (Aisheng Dong). - Improve kerneldoc comments for two functions in the cpufreq core (Rafael Wysocki). - Fix typo in a PM QoS file description comment (Aisheng Dong). - Update the handling of CPU boost frequencies in the cpupower utility (Abhishek Goel)" * tag 'pm-5.1-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpuidle: governor: Add new governors to cpuidle_governors again cpufreq: intel_pstate: Fix up iowait_boost computation PM / OPP: Update performance state when freq == old_freq PM / wakeup: Drop wakeup_source_drop() PM / wakeup: Rework wakeup source timer cancellation PM / domains: Remove one unnecessary blank line PM / Domains: Return early for all errors in _genpd_power_off() PM / Domains: Improve warn for multiple states but no governor OPP: Fix handling of multiple power domains PM / QoS: Fix typo in file description cpufreq: pxa2xx: remove incorrect __init annotation PM-runtime: Call pm_runtime_active|suspended_time() from sysfs PM-runtime: Consolidate code to get active/suspended time PM: Add and use pr_fmt() cpufreq: Improve kerneldoc comments for cpufreq_cpu_get/put() cpuidle: menu: Avoid overflows when computing variance tools/power/cpupower: Display boost frequency separately
2019-03-14Merge tag 'microblaze-v5.1-rc1' of git://git.monstr.eu/linux-2.6-microblazeLinus Torvalds1-11/+2
Pull Microblaze update from Michal Simek: "Simplify debugfs initialization" * tag 'microblaze-v5.1-rc1' of git://git.monstr.eu/linux-2.6-microblaze: microblaze: no need to check return value of debugfs_create functions
2019-03-14Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds46-213/+427
Pull networking fixes from David Miller: "More fixes in the queue: 1) Netfilter nat can erroneously register the device notifier twice, fix from Florian Westphal. 2) Use after free in nf_tables, from Pablo Neira Ayuso. 3) Parallel update of steering rule fix in mlx5 river, from Eli Britstein. 4) RX processing panic in lan743x, fix from Bryan Whitehead. 5) Use before initialization of TCP_SKB_CB, fix from Christoph Paasch. 6) Fix locking in SRIOV mode of mlx4 driver, from Jack Morgenstein. 7) Fix TX stalls in lan743x due to mishandling of interrupt ACKing modes, from Bryan Whitehead. 8) Fix infoleak in l2tp_ip6_recvmsg(), from Eric Dumazet" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (43 commits) pptp: dst_release sk_dst_cache in pptp_sock_destruct MAINTAINERS: GENET & SYSTEMPORT: Add internal Broadcom list l2tp: fix infoleak in l2tp_ip6_recvmsg() net/tls: Inform user space about send buffer availability net_sched: return correct value for *notify* functions lan743x: Fix TX Stall Issue net/mlx4_core: Fix qp mtt size calculation net/mlx4_core: Fix locking in SRIOV mode when switching between events and polling net/mlx4_core: Fix reset flow when in command polling mode mlxsw: minimal: Initialize base_mac mlxsw: core: Prevent duplication during QSFP module initialization net: dwmac-sun8i: fix a missing check of of_get_phy_mode net: sh_eth: fix a missing check of of_get_phy_mode net: 8390: fix potential NULL pointer dereferences net: fujitsu: fix a potential NULL pointer dereference net: qlogic: fix a potential NULL pointer dereference isdn: hfcpci: fix potential NULL pointer dereference Documentation: devicetree: add a new optional property for port mac address net: rocker: fix a potential NULL pointer dereference net: qlge: fix a potential NULL pointer dereference ...
2019-03-14Merge tag 'dmaengine-5.1-rc1' of git://git.infradead.org/users/vkoul/slave-dmaLinus Torvalds61-654/+2592
Pull dmaengine updates from Vinod Koul: - dmatest updates for modularizing common struct and code - remove SG support for VDMA xilinx IP and updates to driver - Update to dw driver to support Intel iDMA controllers multi-block support - tegra updates for proper reporting of residue - Add Snow Ridge ioatdma device id and support for IOATDMA v3.4 - struct_size() usage and useless LIST_HEAD cleanups in subsystem. - qDMA controller driver for Layerscape SoCs - stm32-dma PM Runtime support - And usual updates to imx-sdma, sprd, Documentation, fsl-edma, bcm2835, qcom_hidma etc * tag 'dmaengine-5.1-rc1' of git://git.infradead.org/users/vkoul/slave-dma: (81 commits) dmaengine: imx-sdma: fix consistent dma test failures dmaengine: imx-sdma: add a test for imx8mq multi sdma devices dmaengine: imx-sdma: add clock ratio 1:1 check dmaengine: dmatest: move test data alloc & free into functions dmaengine: dmatest: add short-hand `buf_size` var in dmatest_func() dmaengine: dmatest: wrap src & dst data into a struct dmaengine: ioatdma: support latency tolerance report (LTR) for v3.4 dmaengine: ioatdma: add descriptor pre-fetch support for v3.4 dmaengine: ioatdma: disable DCA enabling on IOATDMA v3.4 dmaengine: ioatdma: Add Snow Ridge ioatdma device id dmaengine: sprd: Change channel id to slave id for DMA cell specifier dt-bindings: dmaengine: sprd: Change channel id to slave id for DMA cell specifier dmaengine: mv_xor: Use correct device for DMA API Documentation :dmaengine: clarify DMA desc. pointer after submission Documentation: dmaengine: fix dmatest.rst warning dmaengine: k3dma: Add support for dma-channel-mask dmaengine: k3dma: Delete axi_config dmaengine: k3dma: Upgrade k3dma driver to support hisi_asp_dma hardware Documentation: bindings: dma: Add binding for dma-channel-mask Documentation: bindings: k3dma: Extend the k3dma driver binding to support hisi-asp ...
2019-03-14Merge tag 'rproc-v5.1' of git://github.com/andersson/remoteprocLinus Torvalds15-120/+706
Pull remoteproc updates from Bjorn Andersson: "This contains the last patches in Loic's remoteproc resource table handling changes, a number of updates to documentation, support for invoking the crash handler (for testing purposes), a fix for the handling of virtio devices during recovery, performance state votes in Qualcomm modem driver, support for specifying board specific firmware path for Qualcomm modem driver and improved support for graceful shutdown of Qualcomm remoteprocs" * tag 'rproc-v5.1' of git://github.com/andersson/remoteproc: (33 commits) remoteproc: fix for "dma-mapping: remove the DMA_MEMORY_EXCLUSIVE flag" remoteproc: fix rproc_check_carveout_da() returned error and comments remoteproc: fix trace buffer va initialization remoteproc: fix rproc_alloc_carveout() for rproc with iommu domain remoteproc: add warning on resource table cast remoteproc: fix rproc_alloc_carveout() bad variable cast remoteproc: fix rproc_da_to_va in case of unallocated carveout remoteproc: correct rproc_mem_entry_init() comments remoteproc: fix recovery procedure rpmsg: virtio: change header file sort style rpmsg: virtio: allocate buffer from parent remoteproc: st: add reserved memory support remoteproc: create vdev subdevice with specific dma memory pool remoteproc: q6v5_adsp: Remove voting for lpass_aon clock dt-binding: remoteproc: Remove lpass_aon clock from adsp pil clock list remoteproc: q6v5-mss: Active powerdomain for SDM845 remoteproc: q6v5-mss: Vote for rpmh power domains remoteproc: qcom: Add support for parsing fw dt bindings remoteproc: qcom_q6v5: don't auto boot remote processor remoteproc: qcom: Wait for shutdown-ack/ind on sysmon shutdown ...
2019-03-14Merge tag 'clk-for-linus' of ↵Linus Torvalds181-1742/+9562
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk subsystem updates from Stephen Boyd: "We have a fairly balanced mix of clk driver updates and clk framework updates this time around. It's the usual pile of new drivers for new hardware out there and the normal small fixes and updates, but then we have some core framework changes too. In the core framework, we introduce support for a clk_get_optional() API to get clks that may not always be populated and a way to devm manage clkdev lookups registered by provider drivers. We also do some refactoring to simplify the interface between clkdev and the common clk framework so we can reuse the DT parsing and clk_get() path in provider drivers in the future. This work will continue in the next few cycles while we convert how providers specify clk parents. On the driver side, the biggest part of the dirstat is the Amlogic clk driver that got support for the G12A SoC. It dominates with almost half the overall diff, while the second largest part of the diff is in the i.MX clk driver that gained support for imx8mm SoCs. After that, we have the Actions Semiconductor and Qualcomm drivers rounding out the big part of the dirstat because they both got new hardware support for SoCs. The rest is just various updates and non-critical fixes for existing drivers. Core: - Convert a few clk bindings to JSON schema format - Add a {devm_}clk_get_optional() API - Add devm_clk_hw_register_clkdev() API to manage clkdev lookups - Start rewriting clk parent registration and supporting device links by moving around code that supports clk_get() and DT parsing of the 'clocks' property New Drivers: - Add Qualcomm MSM8998 RPM managed clks - IPA clk support on Qualcomm RPMh clk controllers - Actions Semi S500 SoC clk support - Support for fixed rate clks populated from an MMIO register - Add RPC (QSPI/HyperFLASH) clocks on Renesas R-Car V3H - Add TMU (timer) clocks on Renesas RZ/G2E - Add Amlogic G12A Always-On Clock Controller - Add 32k clock generation for Amlogic AXG - Add support for the Mali GPU clocks on Amlogic Meson8 - Add Amlogic G12A EE clock controller driver - Add missing CANFD clocks on Renesas RZ/G2M and RZ/G2E - Add i.MX8MM SoC clk driver support Removed Drivers: - Remove clps711x driver as the board support is gone Updates: - 3rd ECO fix for Mediatek MT2712 SoCs - Updates for Qualcomm MSM8998 GCC clks - Random static analysis fixes for clk drivers - Support for sleeping gpios in the clk-gpio type - Minor fixes for STM32MP1 clk driver (parents, critical flag, etc.) - Split LCDC into two clks on the Marvell MMP2 SoC - Various DT of_node refcount fixes - Get rid of CLK_IS_BASIC from TI code (yay!) - TI Autoidle clk support - Fix Amlogic Meson8 APB clock ID name - Claim input clocks through DT for Amlogic AXG and GXBB - Correct the DU (display unit) parent clock on Renesas RZ/G2E - Exynos5433 IMEM CMU crypto clk support (SlimSS) - Fix for the PLL-MIPI on the Allwinner A23 - Fix Rockchip rk3328 PLL rate calculation - Add SET_RATE_PARENT flag on display clk of Rockhip rk3066 - i.MX SCU clk driver clk_set_parent() and cpufreq support" * tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (150 commits) dt-bindings: clock: imx8mq: Fix numbering overlaps and gaps clk: ti: clkctrl: Fix clkdm_name regression for TI_CLK_CLKCTRL_COMPAT clk: fixup default index for of_clk_get_by_name() clk: Move of_clk_*() APIs into clk.c from clkdev.c clk: Inform the core about consumer devices clk: Introduce of_clk_get_hw_from_clkspec() clk: core: clarify the check for runtime PM clk: Combine __clk_get() and __clk_create_clk() clk: imx8mq: add GPIO clocks to clock tree clk: mediatek: correct cpu clock name for MT8173 SoC clk: imx: Refactor entire sccg pll clk clk: imx: scu: add cpu frequency scaling support clk: mediatek: Mark bus and DRAM related clocks as critical clk: mediatek: Add flags to mtk_gate clk: mediatek: Add MUX_FLAGS macro clk: qcom: gcc-sdm845: Define parent of PCIe PIPE clocks clk: ingenic: Remove set but not used variable 'enable' clk: at91: programmable: remove unneeded register read clk: mediatek: using CLK_MUX_ROUND_CLOSEST for the clock of dpi1_sel clk: mediatek: add MUX_GATE_FLAGS_2 ...
2019-03-14Merge branches 'acpi-tables', 'acpi-debug', 'acpi-doc' and 'acpi-misc'Rafael J. Wysocki4-23/+30
* acpi-tables: ACPI: tables: Simplify PPTT leaf node detection * acpi-debug: ACPI: sysfs: Prevent get_status() from returning acpi_status * acpi-doc: ACPI: Documentation: Fix path for acpidbg tool * acpi-misc: ACPI / configfs: Mark local data structures static ACPI / configfs: Mark local functions static
2019-03-14Merge branches 'pm-opp' and 'pm-tools'Rafael J. Wysocki5-30/+65
* pm-opp: PM / OPP: Update performance state when freq == old_freq OPP: Fix handling of multiple power domains * pm-tools: tools/power/cpupower: Display boost frequency separately
2019-03-14Merge branch 'pm-domains'Rafael J. Wysocki2-5/+4
* pm-domains: PM / domains: Remove one unnecessary blank line PM / Domains: Return early for all errors in _genpd_power_off() PM / Domains: Improve warn for multiple states but no governor
2019-03-14Merge branches 'pm-cpuidle' and 'pm-cpufreq'Rafael J. Wysocki5-19/+14
* pm-cpuidle: cpuidle: governor: Add new governors to cpuidle_governors again cpuidle: menu: Avoid overflows when computing variance * pm-cpufreq: cpufreq: intel_pstate: Fix up iowait_boost computation cpufreq: pxa2xx: remove incorrect __init annotation cpufreq: Improve kerneldoc comments for cpufreq_cpu_get/put()
2019-03-14Merge branches 'pm-core', 'pm-sleep' and 'pm-qos'Rafael J. Wysocki7-42/+25
* pm-core: PM-runtime: Call pm_runtime_active|suspended_time() from sysfs PM-runtime: Consolidate code to get active/suspended time * pm-sleep: PM / wakeup: Drop wakeup_source_drop() PM / wakeup: Rework wakeup source timer cancellation * pm-qos: PM / QoS: Fix typo in file description
2019-03-13pptp: dst_release sk_dst_cache in pptp_sock_destructXin Long1-0/+1
sk_setup_caps() is called to set sk->sk_dst_cache in pptp_connect, so we have to dst_release(sk->sk_dst_cache) in pptp_sock_destruct, otherwise, the dst refcnt will leak. It can be reproduced by this syz log: r1 = socket$pptp(0x18, 0x1, 0x2) bind$pptp(r1, &(0x7f0000000100)={0x18, 0x2, {0x0, @local}}, 0x1e) connect$pptp(r1, &(0x7f0000000000)={0x18, 0x2, {0x3, @remote}}, 0x1e) Consecutive dmesg warnings will occur: unregister_netdevice: waiting for lo to become free. Usage count = 1 v1->v2: - use rcu_dereference_protected() instead of rcu_dereference_check(), as suggested by Eric. Fixes: 00959ade36ac ("PPTP: PPP over IPv4 (Point-to-Point Tunneling Protocol)") Reported-by: Xiumei Mu <xmu@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-13MAINTAINERS: GENET & SYSTEMPORT: Add internal Broadcom listFlorian Fainelli1-0/+2
There is a patchwork instance behind bcm-kernel-feedback-list that is helpful to track submissions, add this list for the Broadcom GENET and SYSTEMPORT drivers. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-13l2tp: fix infoleak in l2tp_ip6_recvmsg()Eric Dumazet1-3/+1
Back in 2013 Hannes took care of most of such leaks in commit bceaa90240b6 ("inet: prevent leakage of uninitialized memory to user in recv syscalls") But the bug in l2tp_ip6_recvmsg() has not been fixed. syzbot report : BUG: KMSAN: kernel-infoleak in _copy_to_user+0x16b/0x1f0 lib/usercopy.c:32 CPU: 1 PID: 10996 Comm: syz-executor362 Not tainted 5.0.0+ #11 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x173/0x1d0 lib/dump_stack.c:113 kmsan_report+0x12e/0x2a0 mm/kmsan/kmsan.c:600 kmsan_internal_check_memory+0x9f4/0xb10 mm/kmsan/kmsan.c:694 kmsan_copy_to_user+0xab/0xc0 mm/kmsan/kmsan_hooks.c:601 _copy_to_user+0x16b/0x1f0 lib/usercopy.c:32 copy_to_user include/linux/uaccess.h:174 [inline] move_addr_to_user+0x311/0x570 net/socket.c:227 ___sys_recvmsg+0xb65/0x1310 net/socket.c:2283 do_recvmmsg+0x646/0x10c0 net/socket.c:2390 __sys_recvmmsg net/socket.c:2469 [inline] __do_sys_recvmmsg net/socket.c:2492 [inline] __se_sys_recvmmsg+0x1d1/0x350 net/socket.c:2485 __x64_sys_recvmmsg+0x62/0x80 net/socket.c:2485 do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291 entry_SYSCALL_64_after_hwframe+0x63/0xe7 RIP: 0033:0x445819 Code: e8 6c b6 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 2b 12 fc ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f64453eddb8 EFLAGS: 00000246 ORIG_RAX: 000000000000012b RAX: ffffffffffffffda RBX: 00000000006dac28 RCX: 0000000000445819 RDX: 0000000000000005 RSI: 0000000020002f80 RDI: 0000000000000003 RBP: 00000000006dac20 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00000000006dac2c R13: 00007ffeba8f87af R14: 00007f64453ee9c0 R15: 20c49ba5e353f7cf Local variable description: ----addr@___sys_recvmsg Variable was created at: ___sys_recvmsg+0xf6/0x1310 net/socket.c:2244 do_recvmmsg+0x646/0x10c0 net/socket.c:2390 Bytes 0-31 of 32 are uninitialized Memory access of size 32 starts at ffff8880ae62fbb0 Data copied to user address 0000000020000000 Fixes: a32e0eec7042 ("l2tp: introduce L2TPv3 IP encapsulation support for IPv6") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-13net/tls: Inform user space about send buffer availabilityVakul Garg2-4/+2
A previous fix ("tls: Fix write space handling") assumed that user space application gets informed about the socket send buffer availability when tls_push_sg() gets called. Inside tls_push_sg(), in case do_tcp_sendpages() returns 0, the function returns without calling ctx->sk_write_space. Further, the new function tls_sw_write_space() did not invoke ctx->sk_write_space. This leads to situation that user space application encounters a lockup always waiting for socket send buffer to become available. Rather than call ctx->sk_write_space from tls_push_sg(), it should be called from tls_write_space. So whenever tcp stack invokes sk->sk_write_space after freeing socket send buffer, we always declare the same to user space by the way of invoking ctx->sk_write_space. Fixes: 7463d3a2db0ef ("tls: Fix write space handling") Signed-off-by: Vakul Garg <vakul.garg@nxp.com> Reviewed-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-13net_sched: return correct value for *notify* functionsZhike Wang2-13/+34
It is confusing to directly use return value of netlink_send()/ netlink_unicast() as the return value of *notify*, as it may be not error at all. Example: in tc_del_tfilter(), after calling tfilter_del_notify(), it will goto errout if (err). However, the netlink_send()/netlink_unicast() will return positive value even for successful case. So it may not call tcf_chain_tp_remove() and so on to clean up the resource, as a result, resource is leaked. It may be easier to only check the return value of tfilter_del_nofiy(), but it is more clean to correct all related functions. Co-developed-by: Zengmo Gao <gaozengmo@jd.com> Signed-off-by: Zhike Wang <wangzhike@jd.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-13lan743x: Fix TX Stall IssueBryan Whitehead1-8/+1
It has been observed that tx queue may stall while downloading from certain web sites (example www.speedtest.net) The cause has been tracked down to a corner case where the tx interrupt vector was disabled automatically, but was not re enabled later. The lan743x has two mechanisms to enable/disable individual interrupts. Interrupts can be enabled/disabled by individual source, and they can also be enabled/disabled by individual vector which has been mapped to the source. Both must be enabled for interrupts to work properly. The TX code path, primarily uses the interrupt enable/disable of the TX source bit, while leaving the vector enabled all the time. However, while investigating this issue it was noticed that the driver requested the use of the vector auto clear feature. The test above revealed a case where the vector enable was cleared unintentionally. This patch fixes the issue by deleting the lines that request the vector auto clear feature to be used. Fixes: 23f0703c125b ("lan743x: Add main source files for new lan743x driver") Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-13Merge tag 'selinux-pr-20190312' of ↵Linus Torvalds1-1/+7
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull selinux fixes from Paul Moore: "Two small fixes for SELinux in v5.1: one adds a buffer length check to the SELinux SCTP code, the other ensures that the SELinux labeling for a NFS mount is not disabled if the filesystem is mounted twice" * tag 'selinux-pr-20190312' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: security/selinux: fix SECURITY_LSM_NATIVE_LABELS on reused superblock selinux: add the missing walk_size + len check in selinux_sctp_bind_connect
2019-03-13Merge tag 'apparmor-pr-2019-03-12' of ↵Linus Torvalds2-0/+2
git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor Pull apparmor fixes from John Johansen: - fix double when failing to unpack secmark rules in policy - fix leak of dentry when profile is removed * tag 'apparmor-pr-2019-03-12' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor: apparmor: fix double free when unpack of secmark rules fails apparmor: delete the dentry in aafs_remove() to avoid a leak apparmor: Fix warning about unused function apparmor_ipv6_postroute
2019-03-13Merge tag 'kconfig-v5.1' of ↵Linus Torvalds7-14/+44
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kconfig updates from Masahiro Yamada: - rename lexer and parse files - fix 'Save as' menu of xconfig * tag 'kconfig-v5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kconfig: fix 'Save As' menu of xconfig kconfig: rename zconf.y to parser.y kconfig: rename zconf.l to lexer.l
2019-03-13Merge tag 'pwm/for-5.1-rc1' of ↵Linus Torvalds13-300/+474
git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm Pull pwm updates from Thierry Reding: "The changes for this cycle are across the board. The bulk of it is cleanups, but there's also new device support in some drivers as well as more conversions to the atomic API" * tag 'pwm/for-5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm: (24 commits) pwm: atmel: Remove useless symbolic definitions pwm: bcm-kona: Update macros to remove braces around numbers pwm: imx27: Only enable the clocks once in .get_state() pwm: rcar: Improve calculation of divider pwm: rcar: Remove legacy APIs pwm: rcar: Use "atomic" API on rcar_pwm_resume() pwm: rcar: Add support "atomic" API pwm: atmel: Add support for SAM9X60's PWM controller pwm: atmel: Add PWM binding for SAM9X60 pwm: atmel: Rename objects of type atmel_pwm_data pwm: atmel: Add support for controllers with 32 bit counters pwm: atmel: Add struct atmel_pwm_data pwm: Add MediaTek MT8183 display PWM driver support pwm: hibvt: Add hi3559v100 support dt-bindings: pwm: hibvt: Add hi3559v100 support pwm: hibvt: Use individual struct per of-data pwm: imx: Signedness bug in imx_pwm_get_state() pwm: imx: Split into two drivers pwm: imx: Don't print an error on -EPROBE_DEFER pwm: imx: Set driver data earlier simplifying the end of ->probe() ...
2019-03-13Merge tag 'mailbox-v5.1' of ↵Linus Torvalds9-18/+903
git://git.linaro.org/landing-teams/working/fujitsu/integration Pull mailbox updates from Jassi Brar: - mailbox-test: support multiple controller instances - misc cleanup: IMX, STM32 and Tegra - new driver: ZynqMP IPI * tag 'mailbox-v5.1' of git://git.linaro.org/landing-teams/working/fujitsu/integration: mailbox: imx: keep MU irq working during suspend/resume dt-bindings: mailbox: Add Xilinx IPI Mailbox mailbox: ZynqMP IPI mailbox controller mailbox: stm32-ipcc: remove useless device_init_wakeup call mailbox: stm32-ipcc: do not enable wakeup source by default mailbox: mailbox-test: fix null pointer if no mmio mailbox: mailbox-test: fix debugfs in multi-instances mailbox: tegra-hsp: mark suspend function as __maybe_unused
2019-03-13Merge branch 'linus' of ↵Linus Torvalds6-28/+51
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fixes from Herbert Xu: "This fixes a bug in the newly added Exynos5433 AES code as well as an old one in the caam driver" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: caam - add missing put_device() call crypto: s5p-sss - fix AES support for Exynos5433
2019-03-13Merge tag 'libnvdimm-for-5.1' of ↵Linus Torvalds15-113/+273
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm updates from Dan Williams: "The bulk of this has been in -next since before the merge window opened, with no known collisions / issues reported. The only detail worth noting, outside the summary below, is that the "libnvdimm-start-pad" topic has been truncated to just cleanups and small fixes. The full topic branch would have doubled down on hacks around the "section alignment" limitation of the core-mm, instead effort is now being spent to address that root issue in the memory hotplug implementation for v5.2. - Fix nfit-bus command submission regression - Support retrieval of short-ARS results if the ARS state is "requires continuation", and even if the "no_init_ars" module parameter is specified - Allow busy-polling of the kernel ARS state by allowing root to reset the exponential back-off timer - Filter potentially stale ARS results by tracking query-ARS relative to the previous start-ARS - Enhance dax_device alignment checks - Add support for the Hyper-V family of device-specific-methods (DSMs) - Add several fixes and workarounds for Hyper-V compatibility - Fix support to cache the dirty-shutdown-count at init" * tag 'libnvdimm-for-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (25 commits) libnvdimm/namespace: Clean up holder_class_store() libnvdimm/of_pmem: Fix platform_no_drv_owner.cocci warnings acpi/nfit: Update NFIT flags error message libnvdimm/btt: Fix LBA masking during 'free list' population libnvdimm/btt: Remove unnecessary code in btt_freelist_init libnvdimm/pfn: Remove dax_label_reserve dax: Check the end of the block-device capacity with dax_direct_access() nfit/ars: Avoid stale ARS results nfit/ars: Allow root to busy-poll the ARS state machine nfit/ars: Introduce scrub_flags nfit/ars: Remove ars_start_flags nfit/ars: Attempt short-ARS even in the no_init_ars case nfit/ars: Attempt a short-ARS whenever the ARS state is idle at boot acpi/nfit: Require opt-in for read-only label configurations libnvdimm/pmem: Honor force_raw for legacy pmem regions libnvdimm/pfn: Account for PAGE_SIZE > info-block-size in nd_pfn_init() libnvdimm: Fix altmap reservation size calculation libnvdimm, pfn: Fix over-trim in trim_pfn_device() acpi/nfit: Fix bus command validation libnvdimm/dimm: Add a no-BLK quirk based on NVDIMM family ...
2019-03-13Merge tag 'fsdax-for-5.1' of ↵Linus Torvalds1-14/+11
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull filesystem-dax updates from Dan Williams: - Fix handling of PMD-sized entries in the Xarray that lead to a crash scenario - Miscellaneous cleanups and small fixes * tag 'fsdax-for-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: dax: Flush partial PMDs correctly fs/dax: NIT fix comment regarding start/end vs range fs/dax: Convert to use vmf_error()
2019-03-13Merge tag 'upstream-5.1-rc1' of ↵Linus Torvalds5-7/+211
git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs Pull UBI and UBIFS updates from Richard Weinberger: - A new interface for UBI to deal better with read disturb - Reject unsupported ioctl flags in UBIFS (xfstests found it) * tag 'upstream-5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs: ubi: wl: Silence uninitialized variable warning ubifs: Reject unsupported ioctl flags explicitly ubi: Expose the bitrot interface ubi: Introduce in_pq()
2019-03-12remoteproc: fix for "dma-mapping: remove the DMA_MEMORY_EXCLUSIVE flag"Stephen Rothwell1-2/+1
The commit 82c5de0ab8db ("dma-mapping: remove the DMA_MEMORY_EXCLUSIVE flag") removed the "flags" parameter for dma_declare_coherent_memory(). Remove the parameter from the call in rproc_add_virtio_dev(). Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> [bjorn: Extended commit message] Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
2019-03-12cpuidle: governor: Add new governors to cpuidle_governors againRafael J. Wysocki1-0/+1
After commit 61cb5758d3c4 ("cpuidle: Add cpuidle.governor= command line parameter") new cpuidle governors are not added to the list of available governors, so governor selection via sysfs doesn't work as expected (even though it is rarely used anyway). Fix that by making cpuidle_register_governor() add new governors to cpuidle_governors again. Fixes: 61cb5758d3c4 ("cpuidle: Add cpuidle.governor= command line parameter") Reported-by: Kees Cook <keescook@chromium.org> Cc: 5.0+ <stable@vger.kernel.org> # 5.0+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-03-12Merge tag 'nfsd-5.1' of git://linux-nfs.org/~bfields/linuxLinus Torvalds11-58/+65
Pull NFS server updates from Bruce Fields: "Miscellaneous NFS server fixes. Probably the most visible bug is one that could artificially limit NFSv4.1 performance by limiting the number of oustanding rpcs from a single client. Neil Brown also gets a special mention for fixing a 14.5-year-old memory-corruption bug in the encoding of NFSv3 readdir responses" * tag 'nfsd-5.1' of git://linux-nfs.org/~bfields/linux: nfsd: allow nfsv3 readdir request to be larger. nfsd: fix wrong check in write_v4_end_grace() nfsd: fix memory corruption caused by readdir nfsd: fix performance-limiting session calculation svcrpc: fix UDP on servers with lots of threads svcrdma: Remove syslog warnings in work completion handlers svcrdma: Squelch compiler warning when SUNRPC_DEBUG is disabled svcrdma: Use struct_size() in kmalloc() svcrpc: fix unlikely races preventing queueing of sockets svcrpc: svc_xprt_has_something_to_do seems a little long SUNRPC: Don't allow compiler optimisation of svc_xprt_release_slot() nfsd: fix an IS_ERR() vs NULL check
2019-03-12Merge tag 'ext4_for_linus' of ↵Linus Torvalds18-147/+257
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 updates from Ted Ts'o: "A large number of bug fixes and cleanups. One new feature to allow users to more easily find the jbd2 journal thread for a particular ext4 file system" * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (25 commits) jbd2: jbd2_get_transaction does not need to return a value jbd2: fix invalid descriptor block checksum ext4: fix bigalloc cluster freeing when hole punching under load ext4: add sysfs attr /sys/fs/ext4/<disk>/journal_task ext4: Change debugging support help prefix from EXT4 to Ext4 ext4: fix compile error when using BUFFER_TRACE jbd2: fix compile warning when using JBUFFER_TRACE ext4: fix some error pointer dereferences ext4: annotate more implicit fall throughs ext4: annotate implicit fall throughs ext4: don't update s_rev_level if not required jbd2: fold jbd2_superblock_csum_{verify,set} into their callers jbd2: fix race when writing superblock ext4: fix crash during online resizing ext4: disallow files with EXT4_JOURNAL_DATA_FL from EXT4_IOC_SWAP_BOOT ext4: add mask of ext4 flags to swap ext4: update quota information while swapping boot loader inode ext4: cleanup pagecache before swap i_data ext4: fix check of inode in swap_inode_boot_loader ext4: unlock unused_pages timely when doing writeback ...
2019-03-12Merge branch 'mlx4-fixes'David S. Miller2-3/+12
Tariq Toukan says: ==================== mlx4_core misc fixes This patchset by Jack contains misc fixes to the mlx4 Core driver. Patch 1 fixes a use-after-free situation by marking (nullifying) the pointer, please queue for -stable >= v4.0. Patch 2 adds a missing lock acquire and release in SRIOV command interface, please queue for -stable >= v4.9. Patch 3 avoids calling roundup_pow_of_two when argument is zero, please queue for -stable >= v3.3. Series generated against net commit: a3b1933d34d5 Merge tag 'mlx5-fixes-2019-03-11' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-12net/mlx4_core: Fix qp mtt size calculationJack Morgenstein1-3/+3
Calculation of qp mtt size (in function mlx4_RST2INIT_wrapper) ultimately depends on function roundup_pow_of_two. If the amount of memory required by the QP is less than one page, roundup_pow_of_two is called with argument zero. In this case, the roundup_pow_of_two result is undefined. Calling roundup_pow_of_two with a zero argument resulted in the following stack trace: UBSAN: Undefined behaviour in ./include/linux/log2.h:61:13 shift exponent 64 is too large for 64-bit type 'long unsigned int' CPU: 4 PID: 26939 Comm: rping Tainted: G OE 4.19.0-rc1 Hardware name: Supermicro X9DR3-F/X9DR3-F, BIOS 3.2a 07/09/2015 Call Trace: dump_stack+0x9a/0xeb ubsan_epilogue+0x9/0x7c __ubsan_handle_shift_out_of_bounds+0x254/0x29d ? __ubsan_handle_load_invalid_value+0x180/0x180 ? debug_show_all_locks+0x310/0x310 ? sched_clock+0x5/0x10 ? sched_clock+0x5/0x10 ? sched_clock_cpu+0x18/0x260 ? find_held_lock+0x35/0x1e0 ? mlx4_RST2INIT_QP_wrapper+0xfb1/0x1440 [mlx4_core] mlx4_RST2INIT_QP_wrapper+0xfb1/0x1440 [mlx4_core] Fix this by explicitly testing for zero, and returning one if the argument is zero (assuming that the next higher power of 2 in this case should be one). Fixes: c82e9aa0a8bc ("mlx4_core: resource tracking for HCA resources used by guests") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-12net/mlx4_core: Fix locking in SRIOV mode when switching between events and ↵Jack Morgenstein1-0/+8
polling In procedures mlx4_cmd_use_events() and mlx4_cmd_use_polling(), we need to guarantee that there are no FW commands in progress on the comm channel (for VFs) or wrapped FW commands (on the PF) when SRIOV is active. We do this by also taking the slave_cmd_mutex when SRIOV is active. This is especially important when switching from event to polling, since we free the command-context array during the switch. If there are FW commands in progress (e.g., waiting for a completion event), the completion event handler will access freed memory. Since the decision to use comm_wait or comm_poll is taken before grabbing the event_sem/poll_sem in mlx4_comm_cmd_wait/poll, we must take the slave_cmd_mutex as well (to guarantee that the decision to use events or polling and the call to the appropriate cmd function are atomic). Fixes: a7e1f04905e5 ("net/mlx4_core: Fix deadlock when switching between polling and event fw commands") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-12net/mlx4_core: Fix reset flow when in command polling modeJack Morgenstein1-0/+1
As part of unloading a device, the driver switches from FW command event mode to FW command polling mode. Part of switching over to polling mode is freeing the command context array memory (unfortunately, currently, without NULLing the command context array pointer). The reset flow calls "complete" to complete all outstanding fw commands (if we are in event mode). The check for event vs. polling mode here is to test if the command context array pointer is NULL. If the reset flow is activated after the switch to polling mode, it will attempt (incorrectly) to complete all the commands in the context array -- because the pointer was not NULLed when the driver switched over to polling mode. As a result, we have a use-after-free situation, which results in a kernel crash. For example: BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffff876c4a8e>] __wake_up_common+0x2e/0x90 PGD 0 Oops: 0000 [#1] SMP Modules linked in: netconsole nfsv3 nfs_acl nfs lockd grace ... CPU: 2 PID: 940 Comm: kworker/2:3 Kdump: loaded Not tainted 3.10.0-862.el7.x86_64 #1 Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090006 04/28/2016 Workqueue: events hv_eject_device_work [pci_hyperv] task: ffff8d1734ca0fd0 ti: ffff8d17354bc000 task.ti: ffff8d17354bc000 RIP: 0010:[<ffffffff876c4a8e>] [<ffffffff876c4a8e>] __wake_up_common+0x2e/0x90 RSP: 0018:ffff8d17354bfa38 EFLAGS: 00010082 RAX: 0000000000000000 RBX: ffff8d17362d42c8 RCX: 0000000000000000 RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffff8d17362d42c8 RBP: ffff8d17354bfa70 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000298 R11: ffff8d173610e000 R12: ffff8d17362d42d0 R13: 0000000000000246 R14: 0000000000000000 R15: 0000000000000003 FS: 0000000000000000(0000) GS:ffff8d1802680000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 00000000f16d8000 CR4: 00000000001406e0 Call Trace: [<ffffffff876c7adc>] complete+0x3c/0x50 [<ffffffffc04242f0>] mlx4_cmd_wake_completions+0x70/0x90 [mlx4_core] [<ffffffffc041e7b1>] mlx4_enter_error_state+0xe1/0x380 [mlx4_core] [<ffffffffc041fa4b>] mlx4_comm_cmd+0x29b/0x360 [mlx4_core] [<ffffffffc041ff51>] __mlx4_cmd+0x441/0x920 [mlx4_core] [<ffffffff877f62b1>] ? __slab_free+0x81/0x2f0 [<ffffffff87951384>] ? __radix_tree_lookup+0x84/0xf0 [<ffffffffc043a8eb>] mlx4_free_mtt_range+0x5b/0xb0 [mlx4_core] [<ffffffffc043a957>] mlx4_mtt_cleanup+0x17/0x20 [mlx4_core] [<ffffffffc04272c7>] mlx4_free_eq+0xa7/0x1c0 [mlx4_core] [<ffffffffc042803e>] mlx4_cleanup_eq_table+0xde/0x130 [mlx4_core] [<ffffffffc0433e08>] mlx4_unload_one+0x118/0x300 [mlx4_core] [<ffffffffc0434191>] mlx4_remove_one+0x91/0x1f0 [mlx4_core] The fix is to set the command context array pointer to NULL after freeing the array. Fixes: f5aef5aa3506 ("net/mlx4_core: Activate reset flow upon fatal command cases") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-12Merge tag 'ceph-for-5.1-rc1' of git://github.com/ceph/ceph-clientLinus Torvalds15-426/+1597
Pull ceph updates from Ilya Dryomov: "The highlights are: - rbd will now ignore discards that aren't aligned and big enough to actually free up some space (myself). This is controlled by the new alloc_size map option and can be disabled if needed. - support for rbd deep-flatten feature (myself). Deep-flatten allows "rbd flatten" to fully disconnect the clone image and its snapshots from the parent and make the parent snapshot removable. - a new round of cap handling improvements (Zheng Yan). The kernel client should now be much more prompt about releasing its caps and it is possible to put a limit on the number of caps held. - support for getting ceph.dir.pin extended attribute (Zheng Yan)" * tag 'ceph-for-5.1-rc1' of git://github.com/ceph/ceph-client: (26 commits) Documentation: modern versions of ceph are not backed by btrfs rbd: advertise support for RBD_FEATURE_DEEP_FLATTEN rbd: whole-object write and zeroout should copyup when snapshots exist rbd: copyup with an empty snapshot context (aka deep-copyup) rbd: introduce rbd_obj_issue_copyup_ops() rbd: stop copying num_osd_ops in rbd_obj_issue_copyup() rbd: factor out __rbd_osd_req_create() rbd: clear ->xferred on error from rbd_obj_issue_copyup() rbd: remove experimental designation from kernel layering ceph: add mount option to limit caps count ceph: periodically trim stale dentries ceph: delete stale dentry when last reference is dropped ceph: remove dentry_lru file from debugfs ceph: touch existing cap when handling reply ceph: pass inclusive lend parameter to filemap_write_and_wait_range() rbd: round off and ignore discards that are too small rbd: handle DISCARD and WRITE_ZEROES separately rbd: get rid of obj_req->obj_request_count libceph: use struct_size() for kmalloc() in crush_decode() ceph: send cap releases more aggressively ...
2019-03-12Merge branch 'mlxsw-Various-fixes'David S. Miller2-10/+29
Ido Schimmel says: ==================== mlxsw: Various fixes Patch #1 fixes the recently introduced QSFP thermal zones to correctly work with split ports, where several ports are mapped to the same module. Patch #2 initializes the base MAC in the minimal driver. The driver is using the base MAC as its parent ID and without initializing it, it is reported as all zeroes to user space. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-12mlxsw: minimal: Initialize base_macJiri Pirko1-0/+18
Currently base_mac is not initialized which causes wrong reporting of zeroed parent_id to userspace. Fix this by initializing base_mac properly. Fixes: c100e47caa8e ("mlxsw: minimal: Add ethtool support") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-12mlxsw: core: Prevent duplication during QSFP module initializationVadim Pasternak1-10/+11
Verify during thermal initialization if QSFP module's entry is already configured in order to prevent duplication. Such scenario could happen in case two switch drivers (PCI and I2C based) coexist and if after boot, splitting configuration is applied for some ports and then I2C based driver is re-probed. In such case after reboot same QSFP module, associated with split will be discovered by I2C based driver few times, and it will cause a crash. It could happen for example on system equipped with BMC (Baseboard Management Controller), running I2C based driver, when the next steps are performed: - System boot - Host side configures port spilt. - BMC side is rebooted. Fixes: 6a79507cfe94 ("mlxsw: core: Extend thermal module with per QSFP module thermal zones") Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-12Merge tag 'for-5.1-part2-tag' of ↵Linus Torvalds7-38/+90
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "Correctness and a deadlock fixes" * tag 'for-5.1-part2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: zstd: ensure reclaim timer is properly cleaned up btrfs: move ulist allocation out of transaction in quota enable btrfs: save drop_progress if we drop refs at all btrfs: check for refs on snapshot delete resume Btrfs: fix deadlock between clone/dedupe and rename Btrfs: fix corruption reading shared and compressed extents after hole punching
2019-03-12net: dwmac-sun8i: fix a missing check of of_get_phy_modeKangjie Lu1-1/+4
of_get_phy_mode may fail and return a negative error code; the fix checks the return value of of_get_phy_mode and returns -EINVAL of it fails. Signed-off-by: Kangjie Lu <kjlu@umn.edu> Acked-by: Maxime Ripard <maxime.ripard@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>