summaryrefslogtreecommitdiffstats
path: root/include/trace
AgeCommit message (Collapse)AuthorFilesLines
2020-06-03Merge branch 'akpm' (patches from Andrew)Linus Torvalds3-15/+24
Merge more updates from Andrew Morton: "More mm/ work, plenty more to come Subsystems affected by this patch series: slub, memcg, gup, kasan, pagealloc, hugetlb, vmscan, tools, mempolicy, memblock, hugetlbfs, thp, mmap, kconfig" * akpm: (131 commits) arm64: mm: use ARCH_HAS_DEBUG_WX instead of arch defined x86: mm: use ARCH_HAS_DEBUG_WX instead of arch defined riscv: support DEBUG_WX mm: add DEBUG_WX support drivers/base/memory.c: cache memory blocks in xarray to accelerate lookup mm/thp: rename pmd_mknotpresent() as pmd_mkinvalid() powerpc/mm: drop platform defined pmd_mknotpresent() mm: thp: don't need to drain lru cache when splitting and mlocking THP hugetlbfs: get unmapped area below TASK_UNMAPPED_BASE for hugetlbfs sparc32: register memory occupied by kernel as memblock.memory include/linux/memblock.h: fix minor typo and unclear comment mm, mempolicy: fix up gup usage in lookup_node tools/vm/page_owner_sort.c: filter out unneeded line mm: swap: memcg: fix memcg stats for huge pages mm: swap: fix vmstats for huge pages mm: vmscan: limit the range of LRU type balancing mm: vmscan: reclaim writepage is IO cost mm: vmscan: determine anon/file pressure balance at the reclaim root mm: balance LRU lists based on relative thrashing mm: only count actual rotations as LRU reclaim cost ...
2020-06-03khugepaged: introduce 'max_ptes_shared' tunableKirill A. Shutemov1-1/+2
'max_ptes_shared' specifies how many pages can be shared across multiple processes. Exceeding the number would block the collapse:: /sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_shared A higher value may increase memory footprint for some workloads. By default, at least half of pages has to be not shared. [colin.king@canonical.com: fix several spelling mistakes] Link: http://lkml.kernel.org/r/20200420084241.65433-1-colin.king@canonical.com Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Tested-by: Zi Yan <ziy@nvidia.com> Reviewed-by: William Kucharski <william.kucharski@oracle.com> Reviewed-by: Zi Yan <ziy@nvidia.com> Acked-by: Yang Shi <yang.shi@linux.alibaba.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Ralph Campbell <rcampbell@nvidia.com> Link: http://lkml.kernel.org/r/20200416160026.16538-9-kirill.shutemov@linux.intel.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-03mm/page_alloc: integrate classzone_idx and high_zoneidxJoonsoo Kim2-14/+22
classzone_idx is just different name for high_zoneidx now. So, integrate them and add some comment to struct alloc_context in order to reduce future confusion about the meaning of this variable. The accessor, ac_classzone_idx() is also removed since it isn't needed after integration. In addition to integration, this patch also renames high_zoneidx to highest_zoneidx since it represents more precise meaning. Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Baoquan He <bhe@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: David Rientjes <rientjes@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@kernel.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Ye Xiaolong <xiaolong.ye@intel.com> Link: http://lkml.kernel.org/r/1587095923-7515-3-git-send-email-iamjoonsoo.kim@lge.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-nextLinus Torvalds3-1/+191
Pull networking updates from David Miller: 1) Allow setting bluetooth L2CAP modes via socket option, from Luiz Augusto von Dentz. 2) Add GSO partial support to igc, from Sasha Neftin. 3) Several cleanups and improvements to r8169 from Heiner Kallweit. 4) Add IF_OPER_TESTING link state and use it when ethtool triggers a device self-test. From Andrew Lunn. 5) Start moving away from custom driver versions, use the globally defined kernel version instead, from Leon Romanovsky. 6) Support GRO vis gro_cells in DSA layer, from Alexander Lobakin. 7) Allow hard IRQ deferral during NAPI, from Eric Dumazet. 8) Add sriov and vf support to hinic, from Luo bin. 9) Support Media Redundancy Protocol (MRP) in the bridging code, from Horatiu Vultur. 10) Support netmap in the nft_nat code, from Pablo Neira Ayuso. 11) Allow UDPv6 encapsulation of ESP in the ipsec code, from Sabrina Dubroca. Also add ipv6 support for espintcp. 12) Lots of ReST conversions of the networking documentation, from Mauro Carvalho Chehab. 13) Support configuration of ethtool rxnfc flows in bcmgenet driver, from Doug Berger. 14) Allow to dump cgroup id and filter by it in inet_diag code, from Dmitry Yakunin. 15) Add infrastructure to export netlink attribute policies to userspace, from Johannes Berg. 16) Several optimizations to sch_fq scheduler, from Eric Dumazet. 17) Fallback to the default qdisc if qdisc init fails because otherwise a packet scheduler init failure will make a device inoperative. From Jesper Dangaard Brouer. 18) Several RISCV bpf jit optimizations, from Luke Nelson. 19) Correct the return type of the ->ndo_start_xmit() method in several drivers, it's netdev_tx_t but many drivers were using 'int'. From Yunjian Wang. 20) Add an ethtool interface for PHY master/slave config, from Oleksij Rempel. 21) Add BPF iterators, from Yonghang Song. 22) Add cable test infrastructure, including ethool interfaces, from Andrew Lunn. Marvell PHY driver is the first to support this facility. 23) Remove zero-length arrays all over, from Gustavo A. R. Silva. 24) Calculate and maintain an explicit frame size in XDP, from Jesper Dangaard Brouer. 25) Add CAP_BPF, from Alexei Starovoitov. 26) Support terse dumps in the packet scheduler, from Vlad Buslov. 27) Support XDP_TX bulking in dpaa2 driver, from Ioana Ciornei. 28) Add devm_register_netdev(), from Bartosz Golaszewski. 29) Minimize qdisc resets, from Cong Wang. 30) Get rid of kernel_getsockopt and kernel_setsockopt in order to eliminate set_fs/get_fs calls. From Christoph Hellwig. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2517 commits) selftests: net: ip_defrag: ignore EPERM net_failover: fixed rollback in net_failover_open() Revert "tipc: Fix potential tipc_aead refcnt leak in tipc_crypto_rcv" Revert "tipc: Fix potential tipc_node refcnt leak in tipc_rcv" vmxnet3: allow rx flow hash ops only when rss is enabled hinic: add set_channels ethtool_ops support selftests/bpf: Add a default $(CXX) value tools/bpf: Don't use $(COMPILE.c) bpf, selftests: Use bpf_probe_read_kernel s390/bpf: Use bcr 0,%0 as tail call nop filler s390/bpf: Maintain 8-byte stack alignment selftests/bpf: Fix verifier test selftests/bpf: Fix sample_cnt shared between two threads bpf, selftests: Adapt cls_redirect to call csum_level helper bpf: Add csum_level helper for fixing up csum levels bpf: Fix up bpf_skb_adjust_room helper's skb csum setting sfc: add missing annotation for efx_ef10_try_update_nic_stats_vf() crypto/chtls: IPv6 support for inline TLS Crypto/chcr: Fixes a coccinile check error Crypto/chcr: Fixes compilations warnings ...
2020-06-02Merge tag 'for-5.8-tag' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs updates from David Sterba: "Highlights: - speedup dead root detection during orphan cleanup, eg. when there are many deleted subvolumes waiting to be cleaned, the trees are now looked up in radix tree instead of a O(N^2) search - snapshot creation with inherited qgroup will mark the qgroup inconsistent, requires a rescan - send will emit file capabilities after chown, this produces a stream that does not need postprocessing to set the capabilities again - direct io ported to iomap infrastructure, cleaned up and simplified code, notably removing last use of struct buffer_head in btrfs code Core changes: - factor out backreference iteration, to be used by ordinary backreferences and relocation code - improved global block reserve utilization * better logic to serialize requests * increased maximum available for unlink * improved handling on large pages (64K) - direct io cleanups and fixes * simplify layering, where cloned bios were unnecessarily created for some cases * error handling fixes (submit, endio) * remove repair worker thread, used to avoid deadlocks during repair - refactored block group reading code, preparatory work for new type of block group storage that should improve mount time on large filesystems Cleanups: - cleaned up (and slightly sped up) set/get helpers for metadata data structure members - root bit REF_COWS got renamed to SHAREABLE to reflect the that the blocks of the tree get shared either among subvolumes or with the relocation trees Fixes: - when subvolume deletion fails due to ENOSPC, the filesystem is not turned read-only - device scan deals with devices from other filesystems that changed ownership due to overwrite (mkfs) - fix a race between scrub and block group removal/allocation - fix long standing bug of a runaway balance operation, printing the same line to the syslog, caused by a stale status bit on a reloc tree that prevented progress - fix corrupt log due to concurrent fsync of inodes with shared extents - fix space underflow for NODATACOW and buffered writes when it for some reason needs to fallback to COW mode" * tag 'for-5.8-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (133 commits) btrfs: fix space_info bytes_may_use underflow during space cache writeout btrfs: fix space_info bytes_may_use underflow after nocow buffered write btrfs: fix wrong file range cleanup after an error filling dealloc range btrfs: remove redundant local variable in read_block_for_search btrfs: open code key_search btrfs: split btrfs_direct_IO to read and write part btrfs: remove BTRFS_INODE_READDIO_NEED_LOCK fs: remove dio_end_io() btrfs: switch to iomap_dio_rw() for dio iomap: remove lockdep_assert_held() iomap: add a filesystem hook for direct I/O bio submission fs: export generic_file_buffered_read() btrfs: turn space cache writeout failure messages into debug messages btrfs: include error on messages about failure to write space/inode caches btrfs: remove useless 'fail_unlock' label from btrfs_csum_file_blocks() btrfs: do not ignore error from btrfs_next_leaf() when inserting checksums btrfs: make checksum item extension more efficient btrfs: fix corrupt log due to concurrent fsync of inodes with shared extents btrfs: unexport btrfs_compress_set_level() btrfs: simplify iget helpers ...
2020-06-02mm/writeback: discard NR_UNSTABLE_NFS, use NR_WRITEBACK insteadNeilBrown1-4/+1
After an NFS page has been written it is considered "unstable" until a COMMIT request succeeds. If the COMMIT fails, the page will be re-written. These "unstable" pages are currently accounted as "reclaimable", either in WB_RECLAIMABLE, or in NR_UNSTABLE_NFS which is included in a 'reclaimable' count. This might have made sense when sending the COMMIT required a separate action by the VFS/MM (e.g. releasepage() used to send a COMMIT). However now that all writes generated by ->writepages() will automatically be followed by a COMMIT (since commit 919e3bd9a875 ("NFS: Ensure we commit after writeback is complete")) it makes more sense to treat them as writeback pages. So this patch removes NR_UNSTABLE_NFS and accounts unstable pages in NR_WRITEBACK and WB_WRITEBACK. A particular effect of this change is that when wb_check_background_flush() calls wb_over_bg_threshold(), the latter will report 'true' a lot less often as the 'unstable' pages are no longer considered 'dirty' (as there is nothing that writeback can do about them anyway). Currently wb_check_background_flush() will trigger writeback to NFS even when there are relatively few dirty pages (if there are lots of unstable pages), this can result in small writes going to the server (10s of Kilobytes rather than a Megabyte) which hurts throughput. With this patch, there are fewer writes which are each larger on average. Where the NR_UNSTABLE_NFS count was included in statistics virtual-files, the entry is retained, but the value is hard-coded as zero. static trace points and warning printks which mentioned this counter no longer report it. [akpm@linux-foundation.org: re-layout comment] [akpm@linux-foundation.org: fix printk warning] Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Trond Myklebust <trond.myklebust@hammerspace.com> Acked-by: Michal Hocko <mhocko@suse.com> [mm] Cc: Christoph Hellwig <hch@lst.de> Cc: Chuck Lever <chuck.lever@oracle.com> Link: http://lkml.kernel.org/r/87d06j7gqa.fsf@notabene.neil.brown.name Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02f2fs: convert from readpages to readaheadMatthew Wilcox (Oracle)1-3/+3
Use the new readahead operation in f2fs Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: William Kucharski <william.kucharski@oracle.com> Reviewed-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Acked-by: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: Dave Chinner <dchinner@redhat.com> Cc: Gao Xiang <gaoxiang25@huawei.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Zi Yan <ziy@nvidia.com> Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com> Cc: Miklos Szeredi <mszeredi@redhat.com> Link: http://lkml.kernel.org/r/20200414150233.24495-23-willy@infradead.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02erofs: convert uncompressed files from readpages to readaheadMatthew Wilcox (Oracle)1-3/+3
Use the new readahead operation in erofs Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: William Kucharski <william.kucharski@oracle.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Acked-by: Gao Xiang <gaoxiang25@huawei.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: Dave Chinner <dchinner@redhat.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Zi Yan <ziy@nvidia.com> Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com> Cc: Miklos Szeredi <mszeredi@redhat.com> Link: http://lkml.kernel.org/r/20200414150233.24495-19-willy@infradead.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-01Merge remote-tracking branch 'regulator/for-5.8' into regulator-linusMark Brown1-0/+32
2020-05-29regulator: core: Add regulator bypass trace pointsCharles Keepax1-0/+32
Add new trace points for the start and end of enabling bypass on a regulator, to allow monitoring of when regulators are moved into bypass and how long that takes. Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://lore.kernel.org/r/20200529152216.9671-1-ckeepax@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-27net_sched: add a tracepoint for qdisc creationCong Wang1-0/+23
With this tracepoint, we could know when qdisc's are created, especially those default qdisc's. Sample output: tc-736 [001] ...1 56.230107: qdisc_create: dev=ens3 kind=pfifo parent=1:0 tc-736 [001] ...1 56.230113: qdisc_create: dev=ens3 kind=hfsc parent=ffff:ffff tc-738 [001] ...1 56.256816: qdisc_create: dev=ens3 kind=pfifo parent=1:100 tc-739 [001] ...1 56.267584: qdisc_create: dev=ens3 kind=pfifo parent=1:200 tc-740 [001] ...1 56.279649: qdisc_create: dev=ens3 kind=fq_codel parent=1:100 tc-741 [001] ...1 56.289996: qdisc_create: dev=ens3 kind=pfifo_fast parent=1:200 tc-745 [000] .N.1 111.687483: qdisc_create: dev=ens3 kind=ingress parent=ffff:fff1 Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-27net_sched: add tracepoints for qdisc_reset() and qdisc_destroy()Cong Wang1-0/+52
Add two tracepoints for qdisc_reset() and qdisc_destroy() to track qdisc resetting and destroying. Sample output: tc-756 [000] ...3 138.355662: qdisc_reset: dev=ens3 kind=pfifo_fast parent=ffff:ffff handle=0:0 tc-756 [000] ...1 138.355720: qdisc_reset: dev=ens3 kind=pfifo_fast parent=ffff:ffff handle=0:0 tc-756 [000] ...1 138.355867: qdisc_reset: dev=ens3 kind=pfifo_fast parent=ffff:ffff handle=0:0 tc-756 [000] ...1 138.355930: qdisc_destroy: dev=ens3 kind=pfifo_fast parent=ffff:ffff handle=0:0 tc-757 [000] ...2 143.073780: qdisc_reset: dev=ens3 kind=fq_codel parent=ffff:ffff handle=8001:0 tc-757 [000] ...1 143.073878: qdisc_reset: dev=ens3 kind=fq_codel parent=ffff:ffff handle=8001:0 tc-757 [000] ...1 143.074114: qdisc_reset: dev=ens3 kind=fq_codel parent=ffff:ffff handle=8001:0 tc-757 [000] ...1 143.074228: qdisc_destroy: dev=ens3 kind=fq_codel parent=ffff:ffff handle=8001:0 Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-25btrfs: fix corrupt log due to concurrent fsync of inodes with shared extentsFilipe Manana1-0/+1
When we have extents shared amongst different inodes in the same subvolume, if we fsync them in parallel we can end up with checksum items in the log tree that represent ranges which overlap. For example, consider we have inodes A and B, both sharing an extent that covers the logical range from X to X + 64KiB: 1) Task A starts an fsync on inode A; 2) Task B starts an fsync on inode B; 3) Task A calls btrfs_csum_file_blocks(), and the first search in the log tree, through btrfs_lookup_csum(), returns -EFBIG because it finds an existing checksum item that covers the range from X - 64KiB to X; 4) Task A checks that the checksum item has not reached the maximum possible size (MAX_CSUM_ITEMS) and then releases the search path before it does another path search for insertion (through a direct call to btrfs_search_slot()); 5) As soon as task A releases the path and before it does the search for insertion, task B calls btrfs_csum_file_blocks() and gets -EFBIG too, because there is an existing checksum item that has an end offset that matches the start offset (X) of the checksum range we want to log; 6) Task B releases the path; 7) Task A does the path search for insertion (through btrfs_search_slot()) and then verifies that the checksum item that ends at offset X still exists and extends its size to insert the checksums for the range from X to X + 64KiB; 8) Task A releases the path and returns from btrfs_csum_file_blocks(), having inserted the checksums into an existing checksum item that got its size extended. At this point we have one checksum item in the log tree that covers the logical range from X - 64KiB to X + 64KiB; 9) Task B now does a search for insertion using btrfs_search_slot() too, but it finds that the previous checksum item no longer ends at the offset X, it now ends at an of offset X + 64KiB, so it leaves that item untouched. Then it releases the path and calls btrfs_insert_empty_item() that inserts a checksum item with a key offset corresponding to X and a size for inserting a single checksum (4 bytes in case of crc32c). Subsequent iterations end up extending this new checksum item so that it contains the checksums for the range from X to X + 64KiB. So after task B returns from btrfs_csum_file_blocks() we end up with two checksum items in the log tree that have overlapping ranges, one for the range from X - 64KiB to X + 64KiB, and another for the range from X to X + 64KiB. Having checksum items that represent ranges which overlap, regardless of being in the log tree or in the chekcsums tree, can lead to problems where checksums for a file range end up not being found. This type of problem has happened a few times in the past and the following commits fixed them and explain in detail why having checksum items with overlapping ranges is problematic: 27b9a8122ff71a "Btrfs: fix csum tree corruption, duplicate and outdated checksums" b84b8390d6009c "Btrfs: fix file read corruption after extent cloning and fsync" 40e046acbd2f36 "Btrfs: fix missing data checksums after replaying a log tree" Since this specific instance of the problem can only happen when logging inodes, because it is the only case where concurrent attempts to insert checksums for the same range can happen, fix the issue by using an extent io tree as a range lock to serialize checksum insertion during inode logging. This issue could often be reproduced by the test case generic/457 from fstests. When it happens it produces the following trace: BTRFS critical (device dm-0): corrupt leaf: root=18446744073709551610 block=30625792 slot=42, csum end range (15020032) goes beyond the start range (15015936) of the next csum item BTRFS info (device dm-0): leaf 30625792 gen 7 total ptrs 49 free space 2402 owner 18446744073709551610 BTRFS info (device dm-0): refs 1 lock (w:0 r:0 bw:0 br:0 sw:0 sr:0) lock_owner 0 current 15884 item 0 key (18446744073709551606 128 13979648) itemoff 3991 itemsize 4 item 1 key (18446744073709551606 128 13983744) itemoff 3987 itemsize 4 item 2 key (18446744073709551606 128 13987840) itemoff 3983 itemsize 4 item 3 key (18446744073709551606 128 13991936) itemoff 3979 itemsize 4 item 4 key (18446744073709551606 128 13996032) itemoff 3975 itemsize 4 item 5 key (18446744073709551606 128 14000128) itemoff 3971 itemsize 4 (...) BTRFS error (device dm-0): block=30625792 write time tree block corruption detected ------------[ cut here ]------------ WARNING: CPU: 1 PID: 15884 at fs/btrfs/disk-io.c:539 btree_csum_one_bio+0x268/0x2d0 [btrfs] Modules linked in: btrfs dm_thin_pool ... CPU: 1 PID: 15884 Comm: fsx Tainted: G W 5.6.0-rc7-btrfs-next-58 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 RIP: 0010:btree_csum_one_bio+0x268/0x2d0 [btrfs] Code: c7 c7 ... RSP: 0018:ffffbb0109e6f8e0 EFLAGS: 00010296 RAX: 0000000000000000 RBX: ffffe1c0847b6080 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffffffffaa963988 RDI: 0000000000000001 RBP: ffff956a4f4d2000 R08: 0000000000000000 R09: 0000000000000001 R10: 0000000000000526 R11: 0000000000000000 R12: ffff956a5cd28bb0 R13: 0000000000000000 R14: ffff956a649c9388 R15: 000000011ed82000 FS: 00007fb419959e80(0000) GS:ffff956a7aa00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000fe6d54 CR3: 0000000138696005 CR4: 00000000003606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: btree_submit_bio_hook+0x67/0xc0 [btrfs] submit_one_bio+0x31/0x50 [btrfs] btree_write_cache_pages+0x2db/0x4b0 [btrfs] ? __filemap_fdatawrite_range+0xb1/0x110 do_writepages+0x23/0x80 __filemap_fdatawrite_range+0xd2/0x110 btrfs_write_marked_extents+0x15e/0x180 [btrfs] btrfs_sync_log+0x206/0x10a0 [btrfs] ? kmem_cache_free+0x315/0x3b0 ? btrfs_log_inode+0x1e8/0xf90 [btrfs] ? __mutex_unlock_slowpath+0x45/0x2a0 ? lockref_put_or_lock+0x9/0x30 ? dput+0x2d/0x580 ? dput+0xb5/0x580 ? btrfs_sync_file+0x464/0x4d0 [btrfs] btrfs_sync_file+0x464/0x4d0 [btrfs] do_fsync+0x38/0x60 __x64_sys_fsync+0x10/0x20 do_syscall_64+0x5c/0x280 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7fb41953a6d0 Code: 48 3d ... RSP: 002b:00007ffcc86bd218 EFLAGS: 00000246 ORIG_RAX: 000000000000004a RAX: ffffffffffffffda RBX: 000000000000000d RCX: 00007fb41953a6d0 RDX: 0000000000000009 RSI: 0000000000040000 RDI: 0000000000000003 RBP: 0000000000040000 R08: 0000000000000001 R09: 0000000000000009 R10: 0000000000000064 R11: 0000000000000246 R12: 0000556cf4b2c060 R13: 0000000000000100 R14: 0000000000000000 R15: 0000556cf322b420 irq event stamp: 0 hardirqs last enabled at (0): [<0000000000000000>] 0x0 hardirqs last disabled at (0): [<ffffffffa96bdedf>] copy_process+0x74f/0x2020 softirqs last enabled at (0): [<ffffffffa96bdedf>] copy_process+0x74f/0x2020 softirqs last disabled at (0): [<0000000000000000>] 0x0 ---[ end trace d543fc76f5ad7fd8 ]--- In that trace the tree checker detected the overlapping checksum items at the time when we triggered writeback for the log tree when syncing the log. Another trace that can happen is due to BUG_ON() when deleting checksum items while logging an inode: BTRFS critical (device dm-0): slot 81 key (18446744073709551606 128 13635584) new key (18446744073709551606 128 13635584) BTRFS info (device dm-0): leaf 30949376 gen 7 total ptrs 98 free space 8527 owner 18446744073709551610 BTRFS info (device dm-0): refs 4 lock (w:1 r:0 bw:0 br:0 sw:1 sr:0) lock_owner 13473 current 13473 item 0 key (257 1 0) itemoff 16123 itemsize 160 inode generation 7 size 262144 mode 100600 item 1 key (257 12 256) itemoff 16103 itemsize 20 item 2 key (257 108 0) itemoff 16050 itemsize 53 extent data disk bytenr 13631488 nr 4096 extent data offset 0 nr 131072 ram 131072 (...) ------------[ cut here ]------------ kernel BUG at fs/btrfs/ctree.c:3153! invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC PTI CPU: 1 PID: 13473 Comm: fsx Not tainted 5.6.0-rc7-btrfs-next-58 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 RIP: 0010:btrfs_set_item_key_safe+0x1ea/0x270 [btrfs] Code: 0f b6 ... RSP: 0018:ffff95e3889179d0 EFLAGS: 00010282 RAX: 0000000000000000 RBX: 0000000000000051 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffffffffb7763988 RDI: 0000000000000001 RBP: fffffffffffffff6 R08: 0000000000000000 R09: 0000000000000001 R10: 00000000000009ef R11: 0000000000000000 R12: ffff8912a8ba5a08 R13: ffff95e388917a06 R14: ffff89138dcf68c8 R15: ffff95e388917ace FS: 00007fe587084e80(0000) GS:ffff8913baa00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fe587091000 CR3: 0000000126dac005 CR4: 00000000003606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: btrfs_del_csums+0x2f4/0x540 [btrfs] copy_items+0x4b5/0x560 [btrfs] btrfs_log_inode+0x910/0xf90 [btrfs] btrfs_log_inode_parent+0x2a0/0xe40 [btrfs] ? dget_parent+0x5/0x370 btrfs_log_dentry_safe+0x4a/0x70 [btrfs] btrfs_sync_file+0x42b/0x4d0 [btrfs] __x64_sys_msync+0x199/0x200 do_syscall_64+0x5c/0x280 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7fe586c65760 Code: 00 f7 ... RSP: 002b:00007ffe250f98b8 EFLAGS: 00000246 ORIG_RAX: 000000000000001a RAX: ffffffffffffffda RBX: 00000000000040e1 RCX: 00007fe586c65760 RDX: 0000000000000004 RSI: 0000000000006b51 RDI: 00007fe58708b000 RBP: 0000000000006a70 R08: 0000000000000003 R09: 00007fe58700cb61 R10: 0000000000000100 R11: 0000000000000246 R12: 00000000000000e1 R13: 00007fe58708b000 R14: 0000000000006b51 R15: 0000558de021a420 Modules linked in: dm_log_writes ... ---[ end trace c92a7f447a8515f5 ]--- CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-05-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller1-10/+42
The MSCC bug fix in 'net' had to be slightly adjusted because the register accesses are done slightly differently in net-next. Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22Merge tag 'rxrpc-fixes-20200520' of ↵David S. Miller1-10/+42
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs David Howells says: ==================== rxrpc: Fix retransmission timeout and ACK discard Here are a couple of fixes and an extra tracepoint for AF_RXRPC: (1) Calculate the RTO pretty much as TCP does, rather than making something up, including an initial 4s timeout (which causes return probes from the fileserver to fail if a packet goes missing), and add backoff. (2) Fix the discarding of out-of-order received ACKs. We mustn't let the hard-ACK point regress, nor do we want to do unnecessary retransmission because the soft-ACK list regresses. This is not trivial, however, due to some loose wording in various old protocol specs, the ACK field that should be used for this sometimes has the wrong information in it. (3) Add a tracepoint to log a discarded ACK. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-21xsk: Remove MEM_TYPE_ZERO_COPY and corresponding codeBjörn Töpel1-1/+0
There are no users of MEM_TYPE_ZERO_COPY. Remove all corresponding code, including the "handle" member of struct xdp_buff. rfc->v1: Fixed spelling in commit message. (Björn) Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200520192103.355233-13-bjorn.topel@gmail.com
2020-05-21xsk: Introduce AF_XDP buffer allocation APIBjörn Töpel1-1/+2
In order to simplify AF_XDP zero-copy enablement for NIC driver developers, a new AF_XDP buffer allocation API is added. The implementation is based on a single core (single producer/consumer) buffer pool for the AF_XDP UMEM. A buffer is allocated using the xsk_buff_alloc() function, and returned using xsk_buff_free(). If a buffer is disassociated with the pool, e.g. when a buffer is passed to an AF_XDP socket, a buffer is said to be released. Currently, the release function is only used by the AF_XDP internals and not visible to the driver. Drivers using this API should register the XDP memory model with the new MEM_TYPE_XSK_BUFF_POOL type. The API is defined in net/xdp_sock_drv.h. The buffer type is struct xdp_buff, and follows the lifetime of regular xdp_buffs, i.e. the lifetime of an xdp_buff is restricted to a NAPI context. In other words, the API is not replacing xdp_frames. In addition to introducing the API and implementations, the AF_XDP core is migrated to use the new APIs. rfc->v1: Fixed build errors/warnings for m68k and riscv. (kbuild test robot) Added headroom/chunk size getter. (Maxim/Björn) v1->v2: Swapped SoBs. (Maxim) v2->v3: Initialize struct xdp_buff member frame_sz. (Björn) Add API to query the DMA address of a frame. (Maxim) Do DMA sync for CPU till the end of the frame to handle possible growth (frame_sz). (Maxim) Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200520192103.355233-6-bjorn.topel@gmail.com
2020-05-20rxrpc: Trace discarded ACKsDavid Howells1-0/+35
Add a tracepoint to track received ACKs that are discarded due to being outside of the Tx window. Signed-off-by: David Howells <dhowells@redhat.com>
2020-05-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller2-5/+5
Move the bpf verifier trace check into the new switch statement in HEAD. Resolve the overlapping changes in hinic, where bug fixes overlap the addition of VF support. Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-11rxrpc: Fix the excessive initial retransmission timeoutDavid Howells1-10/+7
rxrpc currently uses a fixed 4s retransmission timeout until the RTT is sufficiently sampled. This can cause problems with some fileservers with calls to the cache manager in the afs filesystem being dropped from the fileserver because a packet goes missing and the retransmission timeout is greater than the call expiry timeout. Fix this by: (1) Copying the RTT/RTO calculation code from Linux's TCP implementation and altering it to fit rxrpc. (2) Altering the various users of the RTT to make use of the new SRTT value. (3) Replacing the use of rxrpc_resend_timeout to use the calculated RTO value instead (which is needed in jiffies), along with a backoff. Notes: (1) rxrpc provides RTT samples by matching the serial numbers on outgoing DATA packets that have the RXRPC_REQUEST_ACK set and PING ACK packets against the reference serial number in incoming REQUESTED ACK and PING-RESPONSE ACK packets. (2) Each packet that is transmitted on an rxrpc connection gets a new per-connection serial number, even for retransmissions, so an ACK can be cross-referenced to a specific trigger packet. This allows RTT information to be drawn from retransmitted DATA packets also. (3) rxrpc maintains the RTT/RTO state on the rxrpc_peer record rather than on an rxrpc_call because many RPC calls won't live long enough to generate more than one sample. (4) The calculated SRTT value is in units of 8ths of a microsecond rather than nanoseconds. The (S)RTT and RTO values are displayed in /proc/net/rxrpc/peers. Fixes: 17926a79320a ([AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both"") Signed-off-by: David Howells <dhowells@redhat.com>
2020-05-10Merge tag 'block-5.7-2020-05-09' of git://git.kernel.dk/linux-blockLinus Torvalds1-4/+4
Pull block fixes from Jens Axboe: - a small series fixing a use-after-free of bdi name (Christoph,Yufen) - NVMe fix for a regression with the smaller CQ update (Alexey) - NVMe fix for a hang at namespace scanning error recovery (Sagi) - fix race with blk-iocost iocg->abs_vdebt updates (Tejun) * tag 'block-5.7-2020-05-09' of git://git.kernel.dk/linux-block: nvme: fix possible hang when ns scanning fails during error recovery nvme-pci: fix "slimmer CQ head update" bdi: add a ->dev_name field to struct backing_dev_info bdi: use bdi_dev_name() to get device name bdi: move bdi_dev_name out of line vboxsf: don't use the source name in the bdi name iocost: protect iocg->abs_vdebt with iocg->waitq.lock
2020-05-09bdi: use bdi_dev_name() to get device nameYufen Yu1-4/+4
Use the common interface bdi_dev_name() to get device name. Signed-off-by: Yufen Yu <yuyufen@huawei.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Add missing <linux/backing-dev.h> include BFQ Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-07Merge tag 'trace-v5.7-rc3' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: - Fix bootconfig causing kernels to fail with CONFIG_BLK_DEV_RAM enabled - Fix allocation leaks in bootconfig tool - Fix a double initialization of a variable - Fix API bootconfig usage from kprobe boot time events - Reject NULL location for kprobes - Fix crash caused by preempt delay module not cleaning up kthread correctly - Add vmalloc_sync_mappings() to prevent x86_64 page faults from recursively faulting from tracing page faults - Fix comment in gpu/trace kerneldoc header - Fix documentation of how to create a trace event class - Make the local tracing_snapshot_instance_cond() function static * tag 'trace-v5.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tools/bootconfig: Fix resource leak in apply_xbc() tracing: Make tracing_snapshot_instance_cond() static tracing: Fix doc mistakes in trace sample gpu/trace: Minor comment updates for gpu_mem_total tracepoint tracing: Add a vmalloc_sync_mappings() for safe measure tracing: Wait for preempt irq delay thread to finish tracing/kprobes: Reject new event if loc is NULL tracing/boottime: Fix kprobe event API usage tracing/kprobes: Fix a double initialization typo bootconfig: Fix to remove bootconfig data from initrd while boot
2020-05-07gpu/trace: Minor comment updates for gpu_mem_total tracepointYiwei Zhang1-1/+1
This change updates the improper comment for the 'size' attribute in the tracepoint definition. Most gfx drivers pre-fault in physical pages instead of making virtual allocations. So we drop the 'Virtual' keyword here and leave this to the implementations. Link: http://lkml.kernel.org/r/20200428220825.169606-1-zzyiwei@google.com Signed-off-by: Yiwei Zhang <zzyiwei@google.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-05-06Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller1-8/+4
Conflicts were all overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-02Merge tag 'nfs-for-5.7-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds1-8/+4
Pull NFS client bugfixes from Trond Myklebust: "Highlights include: Stable fixes: - fix handling of backchannel binding in BIND_CONN_TO_SESSION Bugfixes: - Fix a credential use-after-free issue in pnfs_roc() - Fix potential posix_acl refcnt leak in nfs3_set_acl - defer slow parts of rpc_free_client() to a workqueue - Fix an Oopsable race in __nfs_list_for_each_server() - Fix trace point use-after-free race - Regression: the RDMA client no longer responds to server disconnect requests - Fix return values of xdr_stream_encode_item_{present, absent} - _pnfs_return_layout() must always wait for layoutreturn completion Cleanups: - Remove unreachable error conditions" * tag 'nfs-for-5.7-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFS: Fix a race in __nfs_list_for_each_server() NFSv4.1: fix handling of backchannel binding in BIND_CONN_TO_SESSION SUNRPC: defer slow parts of rpc_free_client() to a workqueue. NFSv4: Remove unreachable error condition due to rpc_run_task() SUNRPC: Remove unreachable error condition xprtrdma: Fix use of xdr_stream_encode_item_{present, absent} xprtrdma: Fix trace point use-after-free race xprtrdma: Restore wake-up-all to rpcrdma_cm_event_handler() nfs: Fix potential posix_acl refcnt leak in nfs3_set_acl NFS/pnfs: Fix a credential use-after-free issue in pnfs_roc() NFS/pnfs: Ensure that _pnfs_return_layout() waits for layoutreturn completion
2020-04-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller4-22/+43
Simple overlapping changes to linux/vermagic.h Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-24Merge tag 'block-5.7-2020-04-24' of git://git.kernel.dk/linux-blockLinus Torvalds2-4/+3
Pull block fixes from Jens Axboe: "A few fixes/changes that should go into this release: - null_blk zoned fixes (Damien) - blkdev_close() sync improvement (Douglas) - Fix regression in blk-iocost that impacted (at least) systemtap (Waiman) - Comment fix, header removal (Zhiqiang, Jianpeng)" * tag 'block-5.7-2020-04-24' of git://git.kernel.dk/linux-block: null_blk: Cleanup zoned device initialization null_blk: Fix zoned command handling block: remove unused header blk-iocost: Fix error on iocost_ioc_vrate_adj bdev: Reduce time holding bd_mutex in sync in blkdev_close() buffer: remove useless comment and WB_REASON_FREE_MORE_MEM, reason.
2020-04-23Merge tag 'nfsd-5.7-rc-1' of git://git.linux-nfs.org/projects/cel/cel-2.6Linus Torvalds1-14/+36
Pull nfsd fixes from Chuck Lever: "The first set of 5.7-rc fixes for NFS server issues. These were all unresolved at the time the 5.7 window opened, and needed some additional time to ensure they were correctly addressed. They are ready now. At the moment I know of one more urgent issue regarding the NFS server. A fix has been tested and is under review. I expect to send one more pull request, containing this fix (which now consists of 3 patches). Fixes: - Address several use-after-free and memory leak bugs - Prevent a backchannel livelock" * tag 'nfsd-5.7-rc-1' of git://git.linux-nfs.org/projects/cel/cel-2.6: svcrdma: Fix leak of svc_rdma_recv_ctxt objects svcrdma: Fix trace point use-after-free race SUNRPC: Fix backchannel RPC soft lockups SUNRPC/cache: Fix unsafe traverse caused double-free in cache_purge nfsd: memory corruption in nfsd4_lock()
2020-04-22net: qrtr: Add tracepoint supportManivannan Sadhasivam1-0/+115
Add tracepoint support for QRTR with NS as the first candidate. Later on this can be extended to core QRTR and transport drivers. The trace_printk() used in NS has been replaced by tracepoints. Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-21blk-iocost: Fix error on iocost_ioc_vrate_adjWaiman Long1-3/+3
Systemtap 4.2 is unable to correctly interpret the "u32 (*missed_ppm)[2]" argument of the iocost_ioc_vrate_adj trace entry defined in include/trace/events/iocost.h leading to the following error: /tmp/stapAcz0G0/stap_c89c58b83cea1724e26395efa9ed4939_6321_aux_6.c:78:8: error: expected ‘;’, ‘,’ or ‘)’ before ‘*’ token , u32[]* __tracepoint_arg_missed_ppm That argument type is indeed rather complex and hard to read. Looking at block/blk-iocost.c. It is just a 2-entry u32 array. By simplifying the argument to a simple "u32 *missed_ppm" and adjusting the trace entry accordingly, the compilation error was gone. Fixes: 7caa47151ab2 ("blkcg: implement blk-iocost") Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-04-20xprtrdma: Fix trace point use-after-free raceChuck Lever1-8/+4
It's not safe to use resources pointed to by the @send_wr of ib_post_send() _after_ that function returns. Those resources are typically freed by the Send completion handler, which can run before ib_post_send() returns. Thus the trace points currently around ib_post_send() in the client's RPC/RDMA transport are a hazard, even when they are disabled. Rearrange them so that they touch the Work Request only _before_ ib_post_send() is invoked. Fixes: ab03eff58eb5 ("xprtrdma: Add trace points in RPC Call transmit paths") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-04-17buffer: remove useless comment and WB_REASON_FREE_MORE_MEM, reason.Zhiqiang Liu1-1/+0
free_more_memory func has been completely removed in commit bc48f001de12 ("buffer: eliminate the need to call free_more_memory() in __getblk_slow()") So comment and `WB_REASON_FREE_MORE_MEM` reason about free_more_memory are no longer needed. Fixes: bc48f001de12 ("buffer: eliminate the need to call free_more_memory() in __getblk_slow()") Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-04-17svcrdma: Fix trace point use-after-free raceChuck Lever1-14/+36
I hit this while testing nfsd-5.7 with kernel memory debugging enabled on my server: Mar 30 13:21:45 klimt kernel: BUG: unable to handle page fault for address: ffff8887e6c279a8 Mar 30 13:21:45 klimt kernel: #PF: supervisor read access in kernel mode Mar 30 13:21:45 klimt kernel: #PF: error_code(0x0000) - not-present page Mar 30 13:21:45 klimt kernel: PGD 3601067 P4D 3601067 PUD 87c519067 PMD 87c3e2067 PTE 800ffff8193d8060 Mar 30 13:21:45 klimt kernel: Oops: 0000 [#1] SMP DEBUG_PAGEALLOC PTI Mar 30 13:21:45 klimt kernel: CPU: 2 PID: 1933 Comm: nfsd Not tainted 5.6.0-rc6-00040-g881e87a3c6f9 #1591 Mar 30 13:21:45 klimt kernel: Hardware name: Supermicro Super Server/X10SRL-F, BIOS 1.0c 09/09/2015 Mar 30 13:21:45 klimt kernel: RIP: 0010:svc_rdma_post_chunk_ctxt+0xab/0x284 [rpcrdma] Mar 30 13:21:45 klimt kernel: Code: c1 83 34 02 00 00 29 d0 85 c0 7e 72 48 8b bb a0 02 00 00 48 8d 54 24 08 4c 89 e6 48 8b 07 48 8b 40 20 e8 5a 5c 2b e1 41 89 c6 <8b> 45 20 89 44 24 04 8b 05 02 e9 01 00 85 c0 7e 33 e9 5e 01 00 00 Mar 30 13:21:45 klimt kernel: RSP: 0018:ffffc90000dfbdd8 EFLAGS: 00010286 Mar 30 13:21:45 klimt kernel: RAX: 0000000000000000 RBX: ffff8887db8db400 RCX: 0000000000000030 Mar 30 13:21:45 klimt kernel: RDX: 0000000000000040 RSI: 0000000000000000 RDI: 0000000000000246 Mar 30 13:21:45 klimt kernel: RBP: ffff8887e6c27988 R08: 0000000000000000 R09: 0000000000000004 Mar 30 13:21:45 klimt kernel: R10: ffffc90000dfbdd8 R11: 00c068ef00000000 R12: ffff8887eb4e4a80 Mar 30 13:21:45 klimt kernel: R13: ffff8887db8db634 R14: 0000000000000000 R15: ffff8887fc931000 Mar 30 13:21:45 klimt kernel: FS: 0000000000000000(0000) GS:ffff88885bd00000(0000) knlGS:0000000000000000 Mar 30 13:21:45 klimt kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Mar 30 13:21:45 klimt kernel: CR2: ffff8887e6c279a8 CR3: 000000081b72e002 CR4: 00000000001606e0 Mar 30 13:21:45 klimt kernel: Call Trace: Mar 30 13:21:45 klimt kernel: ? svc_rdma_vec_to_sg+0x7f/0x7f [rpcrdma] Mar 30 13:21:45 klimt kernel: svc_rdma_send_write_chunk+0x59/0xce [rpcrdma] Mar 30 13:21:45 klimt kernel: svc_rdma_sendto+0xf9/0x3ae [rpcrdma] Mar 30 13:21:45 klimt kernel: ? nfsd_destroy+0x51/0x51 [nfsd] Mar 30 13:21:45 klimt kernel: svc_send+0x105/0x1e3 [sunrpc] Mar 30 13:21:45 klimt kernel: nfsd+0xf2/0x149 [nfsd] Mar 30 13:21:45 klimt kernel: kthread+0xf6/0xfb Mar 30 13:21:45 klimt kernel: ? kthread_queue_delayed_work+0x74/0x74 Mar 30 13:21:45 klimt kernel: ret_from_fork+0x3a/0x50 Mar 30 13:21:45 klimt kernel: Modules linked in: ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue ib_umad ib_ipoib mlx4_ib sb_edac x86_pkg_temp_thermal iTCO_wdt iTCO_vendor_support coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel glue_helper crypto_simd cryptd pcspkr rpcrdma i2c_i801 rdma_ucm lpc_ich mfd_core ib_iser rdma_cm iw_cm ib_cm mei_me raid0 libiscsi mei sg scsi_transport_iscsi ioatdma wmi ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter nfsd nfs_acl lockd auth_rpcgss grace sunrpc ip_tables xfs libcrc32c mlx4_en sd_mod sr_mod cdrom mlx4_core crc32c_intel igb nvme i2c_algo_bit ahci i2c_core libahci nvme_core dca libata t10_pi qedr dm_mirror dm_region_hash dm_log dm_mod dax qede qed crc8 ib_uverbs ib_core Mar 30 13:21:45 klimt kernel: CR2: ffff8887e6c279a8 Mar 30 13:21:45 klimt kernel: ---[ end trace 87971d2ad3429424 ]--- It's absolutely not safe to use resources pointed to by the @send_wr argument of ib_post_send() _after_ that function returns. Those resources are typically freed by the Send completion handler, which can run before ib_post_send() returns. Thus the trace points currently around ib_post_send() in the server's RPC/RDMA transport are a hazard, even when they are disabled. Rearrange them so that they touch the Work Request only _before_ ib_post_send() is invoked. Fixes: bd2abef33394 ("svcrdma: Trace key RDMA API events") Fixes: 4201c7464753 ("svcrdma: Introduce svc_rdma_send_ctxt") Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-04-17blk-wbt: Drop needless newlines from tracepoint format stringsTommi Rantala1-4/+4
Drop needless newlines from tracepoint format strings, they only add empty lines to perf tracing output. Signed-off-by: Tommi Rantala <tommi.t.rantala@nokia.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-04-07Merge branch 'akpm' (patches from Andrew)Linus Torvalds3-1/+3
Merge more updates from Andrew Morton: - a lot more of MM, quite a bit more yet to come: (memcg, pagemap, vmalloc, pagealloc, migration, thp, ksm, madvise, virtio, userfaultfd, memory-hotplug, shmem, rmap, zswap, zsmalloc, cleanups) - various other subsystems (procfs, misc, MAINTAINERS, bitops, lib, checkpatch, epoll, binfmt, kallsyms, reiserfs, kmod, gcov, kconfig, ubsan, fault-injection, ipc) * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (158 commits) ipc/shm.c: make compat_ksys_shmctl() static ipc/mqueue.c: fix a brace coding style issue lib/Kconfig.debug: fix a typo "capabilitiy" -> "capability" ubsan: include bug type in report header kasan: unset panic_on_warn before calling panic() ubsan: check panic_on_warn drivers/misc/lkdtm/bugs.c: add arithmetic overflow and array bounds checks ubsan: split "bounds" checker from other options ubsan: add trap instrumentation option init/Kconfig: clean up ANON_INODES and old IO schedulers options kernel/gcov/fs.c: replace zero-length array with flexible-array member gcov: gcc_3_4: replace zero-length array with flexible-array member gcov: gcc_4_7: replace zero-length array with flexible-array member kernel/kmod.c: fix a typo "assuems" -> "assumes" reiserfs: clean up several indentation issues kallsyms: unexport kallsyms_lookup_name() and kallsyms_on_each_symbol() samples/hw_breakpoint: drop use of kallsyms_lookup_name() samples/hw_breakpoint: drop HW_BREAKPOINT_R when reporting writes fs/binfmt_elf.c: don't free interpreter's ELF pheaders on common path fs/binfmt_elf.c: allocate less for static executable ...
2020-04-07Merge tag 'nfs-for-5.7-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds1-90/+63
Pull NFS client updates from Trond Myklebust: "Highlights include: Stable fixes: - Fix a page leak in nfs_destroy_unlinked_subrequests() - Fix use-after-free issues in nfs_pageio_add_request() - Fix new mount code constant_table array definitions - finish_automount() requires us to hold 2 refs to the mount record Features: - Improve the accuracy of telldir/seekdir by using 64-bit cookies when possible. - Allow one RDMA active connection and several zombie connections to prevent blocking if the remote server is unresponsive. - Limit the size of the NFS access cache by default - Reduce the number of references to credentials that are taken by NFS - pNFS files and flexfiles drivers now support per-layout segment COMMIT lists. - Enable partial-file layout segments in the pNFS/flexfiles driver. - Add support for CB_RECALL_ANY to the pNFS flexfiles layout type - pNFS/flexfiles Report NFS4ERR_DELAY and NFS4ERR_GRACE errors from the DS using the layouterror mechanism. Bugfixes and cleanups: - SUNRPC: Fix krb5p regressions - Don't specify NFS version in "UDP not supported" error - nfsroot: set tcp as the default transport protocol - pnfs: Return valid stateids in nfs_layout_find_inode_by_stateid() - alloc_nfs_open_context() must use the file cred when available - Fix locking when dereferencing the delegation cred - Fix memory leaks in O_DIRECT when nfs_get_lock_context() fails - Various clean ups of the NFS O_DIRECT commit code - Clean up RDMA connect/disconnect - Replace zero-length arrays with C99-style flexible arrays" * tag 'nfs-for-5.7-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (86 commits) NFS: Clean up process of marking inode stale. SUNRPC: Don't start a timer on an already queued rpc task NFS/pnfs: Reference the layout cred in pnfs_prepare_layoutreturn() NFS/pnfs: Fix dereference of layout cred in pnfs_layoutcommit_inode() NFS: Beware when dereferencing the delegation cred NFS: Add a module parameter to set nfs_mountpoint_expiry_timeout NFS: finish_automount() requires us to hold 2 refs to the mount record NFS: Fix a few constant_table array definitions NFS: Try to join page groups before an O_DIRECT retransmission NFS: Refactor nfs_lock_and_join_requests() NFS: Reverse the submission order of requests in __nfs_pageio_add_request() NFS: Clean up nfs_lock_and_join_requests() NFS: Remove the redundant function nfs_pgio_has_mirroring() NFS: Fix memory leaks in nfs_pageio_stop_mirroring() NFS: Fix a request reference leak in nfs_direct_write_clear_reqs() NFS: Fix use-after-free issues in nfs_pageio_add_request() NFS: Fix races nfs_page_group_destroy() vs nfs_destroy_unlinked_subrequests() NFS: Fix a page leak in nfs_destroy_unlinked_subrequests() NFS: Remove unused FLUSH_SYNC support in nfs_initiate_pgio() pNFS/flexfiles: Specify the layout segment range in LAYOUTGET ...
2020-04-07Merge tag 'f2fs-for-5.7-rc1' of ↵Linus Torvalds1-1/+2
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "In this round, we've mainly focused on fixing bugs and addressing issues in recently introduced compression support. Enhancement: - add zstd support, and set LZ4 by default - add ioctl() to show # of compressed blocks - show mount time in debugfs - replace rwsem with spinlock - avoid lock contention in DIO reads Some major bug fixes wrt compression: - compressed block count - memory access and leak - remove obsolete fields - flag controls Other bug fixes and clean ups: - fix overflow when handling .flags in inode_info - fix SPO issue during resize FS flow - fix compression with fsverity enabled - potential deadlock when writing compressed pages - show missing mount options" * tag 'f2fs-for-5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (66 commits) f2fs: keep inline_data when compression conversion f2fs: fix to disable compression on directory f2fs: add missing CONFIG_F2FS_FS_COMPRESSION f2fs: switch discard_policy.timeout to bool type f2fs: fix to verify tpage before releasing in f2fs_free_dic() f2fs: show compression in statx f2fs: clean up dic->tpages assignment f2fs: compress: support zstd compress algorithm f2fs: compress: add .{init,destroy}_decompress_ctx callback f2fs: compress: fix to call missing destroy_compress_ctx() f2fs: change default compression algorithm f2fs: clean up {cic,dic}.ref handling f2fs: fix to use f2fs_readpage_limit() in f2fs_read_multi_pages() f2fs: xattr.h: Make stub helpers inline f2fs: fix to avoid double unlock f2fs: fix potential .flags overflow on 32bit architecture f2fs: fix NULL pointer dereference in f2fs_verity_work() f2fs: fix to clear PG_error if fsverity failed f2fs: don't call fscrypt_get_encryption_info() explicitly in f2fs_tmpfile() f2fs: don't trigger data flush in foreground operation ...
2020-04-07khugepaged: skip collapse if uffd-wp detectedPeter Xu1-0/+1
Don't collapse the huge PMD if there is any userfault write protected small PTEs. The problem is that the write protection is in small page granularity and there's no way to keep all these write protection information if the small pages are going to be merged into a huge PMD. The same thing needs to be considered for swap entries and migration entries. So do the check as well disregarding khugepaged_max_ptes_swap. Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Jerome Glisse <jglisse@redhat.com> Reviewed-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Bobby Powers <bobbypowers@gmail.com> Cc: Brian Geffon <bgeffon@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Denis Plotnikov <dplotnikov@virtuozzo.com> Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: "Kirill A . Shutemov" <kirill@shutemov.name> Cc: Martin Cracauer <cracauer@cons.org> Cc: Marty McFadden <mcfadden8@llnl.gov> Cc: Maya Gokhale <gokhale2@llnl.gov> Cc: Mel Gorman <mgorman@suse.de> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Rik van Riel <riel@redhat.com> Cc: Shaohua Li <shli@fb.com> Link: http://lkml.kernel.org/r/20200220163112.11409-12-peterx@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-07mm: code cleanup for MADV_FREEHuang Ying1-1/+1
Some comments for MADV_FREE is revised and added to help people understand the MADV_FREE code, especially the page flag, PG_swapbacked. This makes page_is_file_cache() isn't consistent with its comments. So the function is renamed to page_is_file_lru() to make them consistent again. All these are put in one patch as one logical change. Suggested-by: David Hildenbrand <david@redhat.com> Suggested-by: Johannes Weiner <hannes@cmpxchg.org> Suggested-by: David Rientjes <rientjes@google.com> Signed-off-by: "Huang, Ying" <ying.huang@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: David Rientjes <rientjes@google.com> Acked-by: Michal Hocko <mhocko@kernel.org> Acked-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Minchan Kim <minchan@kernel.org> Cc: Hugh Dickins <hughd@google.com> Cc: Rik van Riel <riel@surriel.com> Link: http://lkml.kernel.org/r/20200317100342.2730705-1-ying.huang@intel.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-07mm/vma: add missing VMA flag readable name for VM_SYNCAnshuman Khandual1-0/+1
Patch series "mm/vma: Use all available wrappers when possible", v2. Apart from adding a VMA flag readable name for trace purpose, this series does some open encoding replacements with availabe VMA specific wrappers. This skips VM_HUGETLB check in vma_migratable() as its already being done with another patch (https://patchwork.kernel.org/patch/11347831/) which is yet to be merged. This patch (of 4): This just adds the missing readable name for VM_SYNC. Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Andy Lutomirski <luto@kernel.org> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Guo Ren <guoren@kernel.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nick Piggin <npiggin@gmail.com> Cc: Paul Burton <paulburton@kernel.org> Cc: Paul Mackerras <paulus@ozlabs.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Rich Felker <dalias@libc.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/1582520593-30704-2-git-send-email-anshuman.khandual@arm.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-05Merge tag 'trace-v5.7' of ↵Linus Torvalds1-0/+57
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: "New tracing features: - The ring buffer is no longer disabled when reading the trace file. The trace_pipe file was made to be used for live tracing and reading as it acted like the normal producer/consumer. As the trace file would not consume the data, the easy way of handling it was to just disable writes to the ring buffer. This came to a surprise to the BPF folks who complained about lost events due to reading. This is no longer an issue. If someone wants to keep the old disabling there's a new option "pause-on-trace" that can be set. - New set_ftrace_notrace_pid file. PIDs in this file will not be traced by the function tracer. Similar to set_ftrace_pid, which makes the function tracer only trace those tasks with PIDs in the file, the set_ftrace_notrace_pid does the reverse. - New set_event_notrace_pid file. PIDs in this file will cause events not to be traced if triggered by a task with a matching PID. Similar to the set_event_pid file but will not be traced. Note, sched_waking and sched_switch events may still be traced if one of the tasks referenced by those events contains a PID that is allowed to be traced. Tracing related features: - New bootconfig option, that is attached to the initrd file. If bootconfig is on the command line, then the initrd file is searched looking for a bootconfig appended at the end. - New GPU tracepoint infrastructure to help the gfx drivers to get off debugfs (acked by Greg Kroah-Hartman) And other minor updates and fixes" * tag 'trace-v5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (27 commits) tracing: Do not allocate buffer in trace_find_next_entry() in atomic tracing: Add documentation on set_ftrace_notrace_pid and set_event_notrace_pid selftests/ftrace: Add test to test new set_event_notrace_pid file selftests/ftrace: Add test to test new set_ftrace_notrace_pid file tracing: Create set_event_notrace_pid to not trace tasks ftrace: Create set_ftrace_notrace_pid to not trace tasks ftrace: Make function trace pid filtering a bit more exact ftrace/kprobe: Show the maxactive number on kprobe_events tracing: Have the document reflect that the trace file keeps tracing enabled ring-buffer/tracing: Have iterator acknowledge dropped events tracing: Do not disable tracing when reading the trace file ring-buffer: Do not disable recording when there is an iterator ring-buffer: Make resize disable per cpu buffer instead of total buffer ring-buffer: Optimize rb_iter_head_event() ring-buffer: Do not die if rb_iter_peek() fails more than thrice ring-buffer: Have rb_iter_head_event() handle concurrent writer ring-buffer: Add page_stamp to iterator for synchronization ring-buffer: Rename ring_buffer_read() to read_buffer_iter_advance() ring-buffer: Have ring_buffer_empty() not depend on tracing stopped tracing: Save off entry when peeking at next entry ...
2020-04-04Merge tag 'nfsd-5.7' of git://git.linux-nfs.org/projects/cel/cel-2.6Linus Torvalds3-37/+165
Pull nfsd updates from Chuck Lever: - Fix EXCHANGE_ID response when NFSD runs in a container - A battery of new static trace points - Socket transports now use bio_vec to send Replies - NFS/RDMA now supports filesystems with no .splice_read method - Favor memcpy() over DMA mapping for small RPC/RDMA Replies - Add pre-requisites for supporting multiple Write chunks - Numerous minor fixes and clean-ups [ Chuck is filling in for Bruce this time while he and his family settle into a new house ] * tag 'nfsd-5.7' of git://git.linux-nfs.org/projects/cel/cel-2.6: (39 commits) svcrdma: Fix leak of transport addresses SUNRPC: Fix a potential buffer overflow in 'svc_print_xprts()' SUNRPC/cache: don't allow invalid entries to be flushed nfsd: fsnotify on rmdir under nfsd/clients/ nfsd4: kill warnings on testing stateids with mismatched clientids nfsd: remove read permission bit for ctl sysctl NFSD: Fix NFS server build errors sunrpc: Add tracing for cache events SUNRPC/cache: Allow garbage collection of invalid cache entries nfsd: export upcalls must not return ESTALE when mountd is down nfsd: Add tracepoints for update of the expkey and export cache entries nfsd: Add tracepoints for exp_find_key() and exp_get_by_name() nfsd: Add tracing to nfsd_set_fh_dentry() nfsd: Don't add locks to closed or closing open stateids SUNRPC: Teach server to use xprt_sock_sendmsg for socket sends SUNRPC: Refactor xs_sendpages() svcrdma: Avoid DMA mapping small RPC Replies svcrdma: Fix double sync of transport header buffer svcrdma: Refactor chunk list encoders SUNRPC: Add encoders for list item discriminators ...
2020-04-03f2fs: compress: support zstd compress algorithmChao Yu1-1/+2
Add zstd compress algorithm support, use "compress_algorithm=zstd" mountoption to enable it. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-04-02Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds2-3/+47
Pull SCSI updates from James Bottomley: "This series has a huge amount of churn because it pulls in Mauro's doc update changing all our txt files to rst ones. Excluding that, we have the usual driver updates (qla2xxx, ufs, lpfc, zfcp, ibmvfc, pm80xx, aacraid), a treewide update for scnprintf and some other minor updates. The major core change is Hannes moving functions out of the aacraid driver and into the core" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (223 commits) scsi: aic7xxx: aic97xx: Remove FreeBSD-specific code scsi: ufs: Do not rely on prefetched data scsi: dc395x: remove dc395x_bios_param scsi: libiscsi: Fix error count for active session scsi: hpsa: correct race condition in offload enabled scsi: message: fusion: Replace zero-length array with flexible-array member scsi: qedi: Add PCI shutdown handler support scsi: qedi: Add MFW error recovery process scsi: ufs: Enable block layer runtime PM for well-known logical units scsi: ufs-qcom: Override devfreq parameters scsi: ufshcd: Let vendor override devfreq parameters scsi: ufshcd: Update the set frequency to devfreq scsi: ufs: Resume ufs host before accessing ufs device scsi: ufs-mediatek: customize the delay for enabling host scsi: ufs: make HCE polling more compact to improve initialization latency scsi: ufs: allow custom delay prior to host enabling scsi: ufs-mediatek: use common delay function scsi: ufs: introduce common and flexible delay function scsi: ufs: use an enum for host capabilities scsi: ufs: fix uninitialized tx_lanes in ufshcd_disable_tx_lcc() ...
2020-04-02mm: mmap: add trace point of vm_unmapped_areaJaewon Kim1-0/+48
Even on 64 bit kernel, the mmap failure can happen for a 32 bit task. Virtual memory space shortage of a task on mmap is reported to userspace as -ENOMEM. It can be confused as physical memory shortage of overall system. The vm_unmapped_area can be called to by some drivers or other kernel core system like filesystem. In my platform, GPU driver calls to vm_unmapped_area and the driver returns -ENOMEM even in GPU side shortage. It can be hard to distinguish which code layer returns the -ENOMEM. Create mmap trace file and add trace point of vm_unmapped_area. i.e.) 277.156599: vm_unmapped_area: addr=77e0d03000 err=0 total_vm=0x17014b flags=0x1 len=0x400000 lo=0x8000 hi=0x7878c27000 mask=0x0 ofs=0x1 342.838740: vm_unmapped_area: addr=0 err=-12 total_vm=0xffb08 flags=0x0 len=0x100000 lo=0x40000000 hi=0xfffff000 mask=0x0 ofs=0x22 [akpm@linux-foundation.org: prefix address printk with 0x, per Matthew] Signed-off-by: Jaewon Kim <jaewon31.kim@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Borislav Petkov <bp@suse.de> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michel Lespinasse <walken@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Link: http://lkml.kernel.org/r/20200320055823.27089-3-jaewon31.kim@samsung.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-03-31Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-nextLinus Torvalds1-7/+11
Pull networking updates from David Miller: "Highlights: 1) Fix the iwlwifi regression, from Johannes Berg. 2) Support BSS coloring and 802.11 encapsulation offloading in hardware, from John Crispin. 3) Fix some potential Spectre issues in qtnfmac, from Sergey Matyukevich. 4) Add TTL decrement action to openvswitch, from Matteo Croce. 5) Allow paralleization through flow_action setup by not taking the RTNL mutex, from Vlad Buslov. 6) A lot of zero-length array to flexible-array conversions, from Gustavo A. R. Silva. 7) Align XDP statistics names across several drivers for consistency, from Lorenzo Bianconi. 8) Add various pieces of infrastructure for offloading conntrack, and make use of it in mlx5 driver, from Paul Blakey. 9) Allow using listening sockets in BPF sockmap, from Jakub Sitnicki. 10) Lots of parallelization improvements during configuration changes in mlxsw driver, from Ido Schimmel. 11) Add support to devlink for generic packet traps, which report packets dropped during ACL processing. And use them in mlxsw driver. From Jiri Pirko. 12) Support bcmgenet on ACPI, from Jeremy Linton. 13) Make BPF compatible with RT, from Thomas Gleixnet, Alexei Starovoitov, and your's truly. 14) Support XDP meta-data in virtio_net, from Yuya Kusakabe. 15) Fix sysfs permissions when network devices change namespaces, from Christian Brauner. 16) Add a flags element to ethtool_ops so that drivers can more simply indicate which coalescing parameters they actually support, and therefore the generic layer can validate the user's ethtool request. Use this in all drivers, from Jakub Kicinski. 17) Offload FIFO qdisc in mlxsw, from Petr Machata. 18) Support UDP sockets in sockmap, from Lorenz Bauer. 19) Fix stretch ACK bugs in several TCP congestion control modules, from Pengcheng Yang. 20) Support virtual functiosn in octeontx2 driver, from Tomasz Duszynski. 21) Add region operations for devlink and use it in ice driver to dump NVM contents, from Jacob Keller. 22) Add support for hw offload of MACSEC, from Antoine Tenart. 23) Add support for BPF programs that can be attached to LSM hooks, from KP Singh. 24) Support for multiple paths, path managers, and counters in MPTCP. From Peter Krystad, Paolo Abeni, Florian Westphal, Davide Caratti, and others. 25) More progress on adding the netlink interface to ethtool, from Michal Kubecek" * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2121 commits) net: ipv6: rpl_iptunnel: Fix potential memory leak in rpl_do_srh_inline cxgb4/chcr: nic-tls stats in ethtool net: dsa: fix oops while probing Marvell DSA switches net/bpfilter: remove superfluous testing message net: macb: Fix handling of fixed-link node net: dsa: ksz: Select KSZ protocol tag netdevsim: dev: Fix memory leak in nsim_dev_take_snapshot_write net: stmmac: add EHL 2.5Gbps PCI info and PCI ID net: stmmac: add EHL PSE0 & PSE1 1Gbps PCI info and PCI ID net: stmmac: create dwmac-intel.c to contain all Intel platform net: dsa: bcm_sf2: Support specifying VLAN tag egress rule net: dsa: bcm_sf2: Add support for matching VLAN TCI net: dsa: bcm_sf2: Move writing of CFP_DATA(5) into slicing functions net: dsa: bcm_sf2: Check earlier for FLOW_EXT and FLOW_MAC_EXT net: dsa: bcm_sf2: Disable learning for ASP port net: dsa: b53: Deny enslaving port 7 for 7278 into a bridge net: dsa: b53: Prevent tagged VLAN on port 7 for 7278 net: dsa: b53: Restore VLAN entries upon (re)configuration net: dsa: bcm_sf2: Fix overflow checks hv_netvsc: Remove unnecessary round_up for recv_completion_cnt ...
2020-03-31Merge tag 'for-5.7-tag' of ↵Linus Torvalds1-3/+3
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs updates from David Sterba: "A number of core changes that make things work better in general, code is simpler and cleaner. Core changes: - per-inode file extent tree, for in memory tracking of contiguous extent ranges to make sure i_size adjustments are accurate - tree root structures are protected by reference counts, replacing SRCU that did not cover some cases - leak detector for tree root structures - per-transaction pinned extent tracking - buffer heads are replaced by bios for super block access - speedup of extent back reference resolution, on an example test scenario the runtime of send went down from a hour to minutes - factor out locking scheme used for subvolume writer and NOCOW exclusion, abstracted as DREW lock, double reader-writer exclusion (allow either readers or writers) - cleanup and abstract extent allocation policies, preparation for zoned device support - make reflink/clone_range work on inline extents - add more cancellation point for relocation, improves long response from 'balance cancel' - add page migration callback for data pages - switch to guid for uuids, with additional cleanups of the interface - make ranged full fsyncs more efficient - removal of obsolete ioctl flag BTRFS_SUBVOL_CREATE_ASYNC - remove b-tree readahead from delayed refs paths, avoiding seek and read unnecessary blocks Features: - v2 of ioctl to delete subvolumes, allowing to delete by id and more future extensions Fixes: - fix qgroup rescan worker that could block umount - fix crash during unmount due to race with delayed inode workers - fix dellaloc flushing logic that could create unnecessary chunks under heavy load - fix missing file extent item for hole after ranged fsync - several fixes in relocation error handling Other: - more documentation of relocation, device replace, space reservations - many random cleanups" * tag 'for-5.7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (210 commits) btrfs: fix missing semaphore unlock in btrfs_sync_file btrfs: use nofs allocations for running delayed items btrfs: sysfs: Use scnprintf() instead of snprintf() btrfs: do not resolve backrefs for roots that are being deleted btrfs: track reloc roots based on their commit root bytenr btrfs: restart relocate_tree_blocks properly btrfs: reloc: reorder reservation before root selection btrfs: do not readahead in build_backref_tree btrfs: do not use readahead for running delayed refs btrfs: Remove async_transid from btrfs_mksubvol/create_subvol/create_snapshot btrfs: Remove transid argument from btrfs_ioctl_snap_create_transid btrfs: Remove BTRFS_SUBVOL_CREATE_ASYNC support btrfs: kill the subvol_srcu btrfs: make btrfs_cleanup_fs_roots use the radix tree lock btrfs: don't take an extra root ref at allocation time btrfs: hold a ref on the root on the dead roots list btrfs: make inodes hold a ref on their roots btrfs: move the root freeing stuff into btrfs_put_root btrfs: move ino_cache_inode dropping out of btrfs_free_fs_root btrfs: make the extent buffer leak check per fs info ...
2020-03-30Merge branch 'sched-core-for-linus' of ↵Linus Torvalds1-22/+31
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "The main changes in this cycle are: - Various NUMA scheduling updates: harmonize the load-balancer and NUMA placement logic to not work against each other. The intended result is better locality, better utilization and fewer migrations. - Introduce Thermal Pressure tracking and optimizations, to improve task placement on thermally overloaded systems. - Implement frequency invariant scheduler accounting on (some) x86 CPUs. This is done by observing and sampling the 'recent' CPU frequency average at ~tick boundaries. The CPU provides this data via the APERF/MPERF MSRs. This hopefully makes our capacity estimates more precise and keeps tasks on the same CPU better even if it might seem overloaded at a lower momentary frequency. (As usual, turbo mode is a complication that we resolve by observing the maximum frequency and renormalizing to it.) - Add asymmetric CPU capacity wakeup scan to improve capacity utilization on asymmetric topologies. (big.LITTLE systems) - PSI fixes and optimizations. - RT scheduling capacity awareness fixes & improvements. - Optimize the CONFIG_RT_GROUP_SCHED constraints code. - Misc fixes, cleanups and optimizations - see the changelog for details" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (62 commits) threads: Update PID limit comment according to futex UAPI change sched/fair: Fix condition of avg_load calculation sched/rt: cpupri_find: Trigger a full search as fallback kthread: Do not preempt current task if it is going to call schedule() sched/fair: Improve spreading of utilization sched: Avoid scale real weight down to zero psi: Move PF_MEMSTALL out of task->flags MAINTAINERS: Add maintenance information for psi psi: Optimize switching tasks inside shared cgroups psi: Fix cpu.pressure for cpu.max and competing cgroups sched/core: Distribute tasks within affinity masks sched/fair: Fix enqueue_task_fair warning thermal/cpu-cooling, sched/core: Move the arch_set_thermal_pressure() API to generic scheduler code sched/rt: Remove unnecessary push for unfit tasks sched/rt: Allow pulling unfitting task sched/rt: Optimize cpupri_find() on non-heterogenous systems sched/rt: Re-instate old behavior in select_task_rq_rt() sched/rt: cpupri_find: Implement fallback mechanism for !fit case sched/fair: Fix reordering of enqueue/dequeue_task_fair() sched/fair: Fix runnable_avg for throttled cfs ...
2020-03-30Merge branch 'core-rcu-for-linus' of ↵Linus Torvalds1-0/+29
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU updates from Ingo Molnar: "The main changes in this cycle were: - Make kfree_rcu() use kfree_bulk() for added performance - RCU updates - Callback-overload handling updates - Tasks-RCU KCSAN and sparse updates - Locking torture test and RCU torture test updates - Documentation updates - Miscellaneous fixes" * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (74 commits) rcu: Make rcu_barrier() account for offline no-CBs CPUs rcu: Mark rcu_state.gp_seq to detect concurrent writes Documentation/memory-barriers: Fix typos doc: Add rcutorture scripting to torture.txt doc/RCU/rcu: Use https instead of http if possible doc/RCU/rcu: Use absolute paths for non-rst files doc/RCU/rcu: Use ':ref:' for links to other docs doc/RCU/listRCU: Update example function name doc/RCU/listRCU: Fix typos in a example code snippets doc/RCU/Design: Remove remaining HTML tags in ReST files doc: Add some more RCU list patterns in the kernel rcutorture: Set KCSAN Kconfig options to detect more data races rcutorture: Manually clean up after rcu_barrier() failure rcutorture: Make rcu_torture_barrier_cbs() post from corresponding CPU rcuperf: Measure memory footprint during kfree_rcu() test rcutorture: Annotation lockless accesses to rcu_torture_current rcutorture: Add READ_ONCE() to rcu_torture_count and rcu_torture_batch rcutorture: Fix stray access to rcu_fwd_cb_nodelay rcutorture: Fix rcu_torture_one_read()/rcu_torture_writer() data race rcutorture: Make kvm-find-errors.sh abort on bad directory ...