Age | Commit message | Author | Files | Lines
2019-03-07 | Merge branch 'akpm' (patches from Andrew) | Linus Torvalds | 148 files | -658/+1100
Merge more updates from Andrew Morton:

 - some of the rest of MM
 - various misc things
 - dynamic-debug updates
 - checkpatch
 - some epoll speedups
 - autofs
 - rapidio
 - lib/, lib/lzo/ updates

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (83 commits)
  samples/mic/mpssd/mpssd.h: remove duplicate header
  kernel/fork.c: remove duplicated include
  include/linux/relay.h: fix percpu annotation in struct rchan
  arch/nios2/mm/fault.c: remove duplicate include
  unicore32: stop printing the virtual memory layout
  MAINTAINERS: fix GTA02 entry and mark as orphan
  mm: create the new vm_fault_t type
  arm, s390, unicore32: remove oneliner wrappers for memblock_alloc()
  arch: simplify several early memory allocations
  openrisc: simplify pte_alloc_one_kernel()
  sh: prefer memblock APIs returning virtual address
  microblaze: prefer memblock API returning virtual address
  powerpc: prefer memblock APIs returning virtual address
  lib/lzo: separate lzo-rle from lzo
  lib/lzo: implement run-length encoding
  lib/lzo: fast 8-byte copy on arm64
  lib/lzo: 64-bit CTZ on arm64
  lib/lzo: tidy-up ifdefs
  ipc/sem.c: replace kvmalloc/memset with kvzalloc and use struct_size
  ipc: annotate implicit fall through
  ...
2019-03-07 | samples/mic/mpssd/mpssd.h: remove duplicate header | Brajeswar Ghosh | 1 file | -3/+0
Remove duplicate headers which are included more than once Link: http://lkml.kernel.org/r/20190114170033.GA3674@hp-pavilion-15-notebook-pc-brajeswar Signed-off-by: Brajeswar Ghosh <brajeswar.linux@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 | kernel/fork.c: remove duplicated include | YueHaibing | 1 file | -1/+0
Remove duplicated include. Link: http://lkml.kernel.org/r/20181209062952.17736-1-yuehaibing@huawei.com Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 | include/linux/relay.h: fix percpu annotation in struct rchan | Luc Van Oostenryck | 1 file | -1/+1
The percpu member of this structure is declared as:

  struct ... ** __percpu member;

So its type is "__percpu pointer to pointer to struct ...". But looking at how it's used, its type should be "pointer to __percpu pointer to struct ...", and it should thus be declared as:

  struct ... * __percpu *member;

So fix the placement of '__percpu' in the definition of this structure. This silences a few Sparse warnings like:

  warning: incorrect type in initializer (different address spaces)
    expected void const [noderef] <asn:3> *__vpp_verify
    got struct sched_domain **

Link: http://lkml.kernel.org/r/20190118144902.79065-1-luc.vanoostenryck@gmail.com Fixes: 017c59c042d01 ("relay: Use per CPU constructs for the relay channel buffer pointers") Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Cc: Jens Axboe <axboe@suse.de> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 | arch/nios2/mm/fault.c: remove duplicate include | Sabyasachi Gupta | 1 file | -1/+0
Remove linux/ptrace.h which is included more than once Link: http://lkml.kernel.org/r/5c45d345.1c69fb81.d90ed.8e05@mx.google.com Signed-off-by: Sabyasachi Gupta <sabyasachi.linux@gmail.com> Cc: Ley Foon Tan <lftan@altera.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 | unicore32: stop printing the virtual memory layout | Geert Uytterhoeven | 1 file | -24/+0
Since commit ad67b74d2469 ("printk: hash addresses printed with %p"), the virtual memory layout printed during boot up contains "ptrval" instead of actual addresses. Instead of changing the printing to "%px", and leaking virtual memory layout information again, just remove the printing completely, cfr. e.g. commits 071929dbdd86 ("arm64: Stop printing the virtual memory layout") and 31833332f798 ("m68k/mm: Stop printing the virtual memory layout"). All interesting information (actual section sizes) is already printed by mem_init_print_info() just above anyway. Link: http://lkml.kernel.org/r/20190121152254.29079-1-geert+renesas@glider.be Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 | MAINTAINERS: fix GTA02 entry and mark as orphan | Jann Horn | 1 file | -2/+3
The entry for GTA02 never had paths listed; fix that. commit 9d76295ac608 ("[ARM] GTA02/FreeRunner: Add machine definition"), which added the entry for GTA02, created two new files named arch/arm/mach-s3c2442/{include/mach/gta02.h,mach-gta02.c}, which were then renamed in commit dd6f01b5ccba ("ARM: S3C2440: move mach-s3c2440/* into mach-s3c24xx/") to arch/arm/mach-s3c24xx/{include/mach/gta02.h,mach-gta02.c}. Also, the GTA02 maintainer's email address is from a domain that doesn't have an MX record anymore and appears to have expired. Remove the maintainer and mark the subsystem as orphan. Link: http://lkml.kernel.org/r/20190215140444.37060-1-jannh@google.com Signed-off-by: Jann Horn <jannh@google.com> Cc: Nelson Castillo <arhuaco@freaks-unidos.net> Cc: Nelson Castillo <nelsoneci@gmail.com> Cc: Andy Green <andy@warmcat.com> Cc: Ben Dooks <ben-linux@fluff.org> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 | mm: create the new vm_fault_t type | Souptick Joarder | 3 files | -48/+73
Page fault handlers are supposed to return VM_FAULT codes, but some drivers/file systems mistakenly return error numbers. Now that all drivers/file systems have been converted to use the vm_fault_t return type, change the type definition to no longer be compatible with 'int'. By making it an unsigned int, the function prototype becomes incompatible with a function which returns int. Sparse will detect any attempts to return a value which is not a VM_FAULT code. VM_FAULT_SET_HINDEX and VM_FAULT_GET_HINDEX values are changed to avoid conflict with other VM_FAULT codes. [jrdr.linux@gmail.com: fix warnings] Link: http://lkml.kernel.org/r/20190109183742.GA24326@jordon-HP-15-Notebook-PC Link: http://lkml.kernel.org/r/20190108183041.GA12137@jordon-HP-15-Notebook-PC Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com> Reviewed-by: William Kucharski <william.kucharski@oracle.com> Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Reviewed-by: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
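For illustration, the distinction matters in driver fault handlers. The sketch below is hypothetical (it is not a hunk from this commit, and assumes the vmf_insert_page() helper); returning an errno such as -ENOMEM instead of a VM_FAULT code is exactly the kind of mistake Sparse can now flag:

    /* Hypothetical handler: must return vm_fault_t values, not errnos. */
    static vm_fault_t example_fault(struct vm_fault *vmf)
    {
        struct page *page = alloc_page(GFP_KERNEL);

        if (!page)
            return VM_FAULT_OOM;    /* correct: a VM_FAULT code */
            /* return -ENOMEM; */   /* wrong: no longer type-compatible */

        /* error/cleanup handling omitted for brevity */
        return vmf_insert_page(vmf->vma, vmf->address, page);
    }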
2019-03-07 | arm, s390, unicore32: remove oneliner wrappers for memblock_alloc() | Mike Rapoport | 3 files | -25/+8
arm, s390 and unicore32 use oneliner wrappers for memblock_alloc(). Replace their usage with direct call to memblock_alloc(). Link: http://lkml.kernel.org/r/1546248566-14910-7-git-send-email-rppt@linux.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Suggested-by: Christoph Hellwig <hch@infradead.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Greentime Hu <green.hu@gmail.com> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Mark Salter <msalter@redhat.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Michal Simek <michal.simek@xilinx.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Paul Mackerras <paulus@samba.org> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 | arch: simplify several early memory allocations | Mike Rapoport | 10 files | -43/+18
There are several early memory allocations in arch/ code that use memblock_phys_alloc() to allocate memory, convert the returned physical address to the virtual address and then set the allocated memory to zero. Exactly the same behaviour can be achieved simply by calling memblock_alloc(): it allocates the memory in the same way as memblock_phys_alloc(), then it performs the phys_to_virt() conversion and clears the allocated memory. Replace the longer sequence with a simpler call to memblock_alloc(). Link: http://lkml.kernel.org/r/1546248566-14910-6-git-send-email-rppt@linux.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Greentime Hu <green.hu@gmail.com> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Mark Salter <msalter@redhat.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Michal Simek <michal.simek@xilinx.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Paul Mackerras <paulus@samba.org> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
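As a sketch of the pattern (a generic example, not a specific hunk from the series; `size` is assumed to come from the caller), the conversion collapses three steps into one call:

    void *table;
    phys_addr_t pa;

    /* before: physical allocation, manual conversion, manual zeroing */
    pa = memblock_phys_alloc(size, SMP_CACHE_BYTES);
    table = phys_to_virt(pa);
    memset(table, 0, size);

    /* after: memblock_alloc() allocates, converts and zeroes in one call */
    table = memblock_alloc(size, SMP_CACHE_BYTES);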
2019-03-07 | openrisc: simplify pte_alloc_one_kernel() | Mike Rapoport | 1 file | -7/+4
The pte_alloc_one_kernel() function allocates a page using __get_free_page(GFP_KERNEL) when mm initialization is complete and memblock_phys_alloc() on the earlier stages. The physical address of the page allocated with memblock_phys_alloc() is converted to the virtual address and in the both cases the allocated page is cleared using clear_page(). The code is simplified by replacing __get_free_page() with get_zeroed_page() and by replacing memblock_phys_alloc() with memblock_alloc(). Link: http://lkml.kernel.org/r/1546248566-14910-5-git-send-email-rppt@linux.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: Stafford Horne <shorne@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Greentime Hu <green.hu@gmail.com> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Mark Salter <msalter@redhat.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Michal Simek <michal.simek@xilinx.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Paul Mackerras <paulus@samba.org> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
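Roughly, the simplification looks like this (a sketch assuming a mem_init_done-style flag, not the exact openrisc diff):

    pte_t *pte;

    /* before: allocate, then convert and/or clear by hand */
    if (likely(mem_init_done)) {
        pte = (pte_t *)__get_free_page(GFP_KERNEL);
        if (pte)
            clear_page(pte);
    } else {
        pte = (pte_t *)phys_to_virt(memblock_phys_alloc(PAGE_SIZE, PAGE_SIZE));
        clear_page(pte);
    }

    /* after: both helpers already hand back zeroed memory */
    if (likely(mem_init_done))
        pte = (pte_t *)get_zeroed_page(GFP_KERNEL);
    else
        pte = memblock_alloc(PAGE_SIZE, PAGE_SIZE);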
2019-03-07 | sh: prefer memblock APIs returning virtual address | Mike Rapoport | 2 files | -16/+7
Rather than use the memblock_alloc_base that returns a physical address and then convert this address to the virtual one, use appropriate memblock function that returns a virtual address. There is a small functional change in the allocation of then NODE_DATA(). Instead of panicing if the local allocation failed, the non-local allocation attempt will be made. Link: http://lkml.kernel.org/r/1546248566-14910-4-git-send-email-rppt@linux.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Greentime Hu <green.hu@gmail.com> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Mark Salter <msalter@redhat.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Michal Simek <michal.simek@xilinx.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Paul Mackerras <paulus@samba.org> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 | microblaze: prefer memblock API returning virtual address | Mike Rapoport | 1 file | -2/+3
Rather than use the memblock_alloc_base that returns a physical address and then convert this address to the virtual one, use appropriate memblock function that returns a virtual address. Link: http://lkml.kernel.org/r/1546248566-14910-3-git-send-email-rppt@linux.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Tested-by: Michal Simek <michal.simek@xilinx.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Greentime Hu <green.hu@gmail.com> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Mark Salter <msalter@redhat.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Paul Mackerras <paulus@samba.org> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 | powerpc: prefer memblock APIs returning virtual address | Mike Rapoport | 9 files | -51/+44
Patch series "memblock: simplify several early memory allocation", v4. These patches simplify some of the early memory allocations by replacing usage of older memblock APIs with newer and shinier ones. Quite a few places in the arch/ code allocated memory using a memblock API that returns a physical address of the allocated area, then converted this physical address to a virtual one and then used memset(0) to clear the allocated range. More recent memblock APIs do all the three steps in one call and their usage simplifies the code. It's important to note that regardless of API used, the core allocation is nearly identical for any set of memblock allocators: first it tries to find a free memory with all the constraints specified by the caller and then falls back to the allocation with some or all constraints disabled. The first three patches perform the conversion of call sites that have exact requirements for the node and the possible memory range. The fourth patch is a bit one-off as it simplifies openrisc's implementation of pte_alloc_one_kernel(), and not only the memblock usage. The fifth patch takes care of simpler cases when the allocation can be satisfied with a simple call to memblock_alloc(). The sixth patch removes one-liner wrappers for memblock_alloc on arm and unicore32, as suggested by Christoph. This patch (of 6): There are a several places that allocate memory using memblock APIs that return a physical address, convert the returned address to the virtual address and frequently also memset(0) the allocated range. Update these places to use memblock allocators already returning a virtual address. Use memblock functions that clear the allocated memory instead of calling memset(0) where appropriate. The calls to memblock_alloc_base() that were not followed by memset(0) are replaced with memblock_alloc_try_nid_raw(). Since the latter does not panic() when the allocation fails, the appropriate panic() calls are added to the call sites. Link: http://lkml.kernel.org/r/1546248566-14910-2-git-send-email-rppt@linux.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Greentime Hu <green.hu@gmail.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Mark Salter <msalter@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Stafford Horne <shorne@gmail.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Christoph Hellwig <hch@infradead.org> Cc: Michal Simek <michal.simek@xilinx.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 | lib/lzo: separate lzo-rle from lzo | Dave Rodgman | 8 files | -17/+226
To prevent any issues with persistent data, separate lzo-rle from lzo so that it is treated as a separate algorithm, and lzo is still available. Link: http://lkml.kernel.org/r/20190205155944.16007-3-dave.rodgman@arm.com Signed-off-by: Dave Rodgman <dave.rodgman@arm.com> Cc: David S. Miller <davem@davemloft.net> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Markus F.X.J. Oberhumer <markus@oberhumer.com> Cc: Matt Sealey <matt.sealey@arm.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <nitingupta910@gmail.com> Cc: Richard Purdie <rpurdie@openedhand.com> Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Cc: Sonny Rao <sonnyrao@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 | lib/lzo: implement run-length encoding | Dave Rodgman | 5 files | -43/+181
Patch series "lib/lzo: run-length encoding support", v5. Following on from the previous lzo-rle patchset: https://lkml.org/lkml/2018/11/30/972 This patchset contains only the RLE patches, and should be applied on top of the non-RLE patches ( https://lkml.org/lkml/2019/2/5/366 ). Previously, some questions were raised around the RLE patches. I've done some additional benchmarking to answer these questions. In short: - RLE offers significant additional performance (data-dependent) - I didn't measure any regressions that were clearly outside the noise One concern with this patchset was around performance - specifically, measuring RLE impact separately from Matt Sealey's patches (CTZ & fast copy). I have done some additional benchmarking which I hope clarifies the benefits of each part of the patchset. Firstly, I've captured some memory via /dev/fmem from a Chromebook with many tabs open which is starting to swap, and then split this into 4178 4k pages. I've excluded the all-zero pages (as zram does), and also the no-zero pages (which won't tell us anything about RLE performance). This should give a realistic test dataset for zram. What I found was that the data is VERY bimodal: 44% of pages in this dataset contain 5% or fewer zeros, and 44% contain over 90% zeros (30% if you include the no-zero pages). This supports the idea of special-casing zeros in zram. Next, I've benchmarked four variants of lzo on these pages (on 64-bit Arm at max frequency): baseline LZO; baseline + Matt Sealey's patches (aka MS); baseline + RLE only; baseline + MS + RLE. Numbers are for weighted roundtrip throughput (the weighting reflects that zram does more compression than decompression). https://drive.google.com/file/d/1VLtLjRVxgUNuWFOxaGPwJYhl_hMQXpHe/view?usp=sharing Matt's patches help in all cases for Arm (and no effect on Intel), as expected. RLE also behaves as expected: with few zeros present, it makes no difference; above ~75%, it gives a good improvement (50 - 300 MB/s on top of the benefit from Matt's patches). Best performance is seen with both MS and RLE patches. Finally, I have benchmarked the same dataset on an x86-64 device. Here, the MS patches make no difference (as expected); RLE helps, similarly as on Arm. There were no definite regressions; allowing for observational error, 0.1% (3/4178) of cases had a regression > 1 standard deviation, of which the largest was 4.6% (1.2 standard deviations). I think this is probably within the noise. https://drive.google.com/file/d/1xCUVwmiGD0heEMx5gcVEmLBI4eLaageV/view?usp=sharing One point to note is that the graphs show RLE appears to help very slightly with no zeros present! This is because the extra code causes the clang optimiser to change code layout in a way that happens to have a significant benefit. Taking baseline LZO and adding a do-nothing line like "__builtin_prefetch(out_len);" immediately before the "goto next" has the same effect. So this is a real, but basically spurious effect - it's small enough not to upset the overall findings. This patch (of 3): When using zram, we frequently encounter long runs of zero bytes. This adds a special case which identifies runs of zeros and encodes them using run-length encoding. This is faster for both compression and decompresion. For high-entropy data which doesn't hit this case, impact is minimal. Compression ratio is within a few percent in all cases. 
This modifies the bitstream in a way which is backwards compatible (i.e., we can decompress old bitstreams, but old versions of lzo cannot decompress new bitstreams). Link: http://lkml.kernel.org/r/20190205155944.16007-2-dave.rodgman@arm.com Signed-off-by: Dave Rodgman <dave.rodgman@arm.com> Cc: David S. Miller <davem@davemloft.net> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Markus F.X.J. Oberhumer <markus@oberhumer.com> Cc: Matt Sealey <matt.sealey@arm.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <nitingupta910@gmail.com> Cc: Richard Purdie <rpurdie@openedhand.com> Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Cc: Sonny Rao <sonnyrao@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
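The gist of special-casing zero runs can be shown with a toy encoder. This is purely illustrative: it is not the lzo-rle bitstream (which stays decodable alongside existing lzo streams), just the general idea of trading a long run of zeros for a short (marker, length) pair:

    /* Toy sketch: emit a (0x00, run-length) pair for each run of zero bytes
     * and copy non-zero bytes through as literals (no escaping; toy only). */
    static size_t toy_zero_rle(const unsigned char *in, size_t len,
                               unsigned char *out)
    {
        size_t i = 0, o = 0;

        while (i < len) {
            if (in[i] == 0) {
                size_t run = 1;

                while (i + run < len && in[i + run] == 0 && run < 255)
                    run++;
                out[o++] = 0x00;                 /* zero-run marker */
                out[o++] = (unsigned char)run;   /* run length */
                i += run;
            } else {
                out[o++] = in[i++];              /* literal byte */
            }
        }
        return o;
    }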
2019-03-07 | lib/lzo: fast 8-byte copy on arm64 | Matt Sealey | 1 file | -1/+1
Enable faster 8-byte copies on arm64. Link: http://lkml.kernel.org/r/20181127161913.23863-6-dave.rodgman@arm.com Link: http://lkml.kernel.org/r/20190205141950.9058-4-dave.rodgman@arm.com Signed-off-by: Matt Sealey <matt.sealey@arm.com> Signed-off-by: Dave Rodgman <dave.rodgman@arm.com> Cc: David S. Miller <davem@davemloft.net> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Markus F.X.J. Oberhumer <markus@oberhumer.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <nitingupta910@gmail.com> Cc: Richard Purdie <rpurdie@openedhand.com> Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Cc: Sonny Rao <sonnyrao@google.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 | lib/lzo: 64-bit CTZ on arm64 | Matt Sealey | 1 file | -1/+1
LZO leaves some performance on the table by not realising that arm64 can optimize count-trailing-zeros bit operations. Add CONFIG_ARM64 to the checked definitions alongside CONFIG_X86_64 to enable the use of rbit/clz instructions on full 64-bit quantities. Link: http://lkml.kernel.org/r/20181127161913.23863-5-dave.rodgman@arm.com Link: http://lkml.kernel.org/r/20190205141950.9058-3-dave.rodgman@arm.com Signed-off-by: Matt Sealey <matt.sealey@arm.com> Signed-off-by: Dave Rodgman <dave.rodgman@arm.com> Cc: David S. Miller <davem@davemloft.net> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Markus F.X.J. Oberhumer <markus@oberhumer.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <nitingupta910@gmail.com> Cc: Richard Purdie <rpurdie@openedhand.com> Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Cc: Sonny Rao <sonnyrao@google.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 | lib/lzo: tidy-up ifdefs | Dave Rodgman | 1 file | -4/+4
Patch series "lib/lzo: performance improvements", v5. This patch (of 3): Modify the ifdefs in lzodefs.h to be more consistent with normal kernel macros (e.g., change __aarch64__ to CONFIG_ARM64). Link: http://lkml.kernel.org/r/20190205141950.9058-2-dave.rodgman@arm.com Signed-off-by: Dave Rodgman <dave.rodgman@arm.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: David S. Miller <davem@davemloft.net> Cc: Nitin Gupta <nitingupta910@gmail.com> Cc: Richard Purdie <rpurdie@openedhand.com> Cc: Markus F.X.J. Oberhumer <markus@oberhumer.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Cc: Sonny Rao <sonnyrao@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Matt Sealey <matt.sealey@arm.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 | ipc/sem.c: replace kvmalloc/memset with kvzalloc and use struct_size | Gustavo A. R. Silva | 1 file | -5/+1
Use kvzalloc() instead of kvmalloc() and memset(). Also, make use of the struct_size() helper instead of the open-coded version in order to avoid any potential type mistakes. This code was detected with the help of Coccinelle. Link: http://lkml.kernel.org/r/20190131214221.GA28930@embeddedor Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
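The transformation follows this pattern (a sketch; member and variable names are illustrative rather than the exact ones in ipc/sem.c):

    struct sem_array *sma;

    /* before: open-coded size calculation plus a separate memset() */
    sma = kvmalloc(sizeof(*sma) + nsems * sizeof(sma->sems[0]), GFP_KERNEL);
    if (sma)
        memset(sma, 0, sizeof(*sma) + nsems * sizeof(sma->sems[0]));

    /* after: struct_size() computes the size with overflow checking,
     * and kvzalloc() returns already-zeroed memory */
    sma = kvzalloc(struct_size(sma, sems, nsems), GFP_KERNEL);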
2019-03-07 | ipc: annotate implicit fall through | Mathieu Malaterre | 1 file | -0/+1
There is a plan to build the kernel with -Wimplicit-fallthrough, and this place in the code produced a warning (W=1). This commit removes the following warning:

  ipc/sem.c:1683:6: warning: this statement may fall through [-Wimplicit-fallthrough=]

Link: http://lkml.kernel.org/r/20190114203608.18218-1-malat@debian.org Signed-off-by: Mathieu Malaterre <malat@debian.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
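The annotation itself is just a comment that GCC's -Wimplicit-fallthrough recognizes. A generic example (the case labels and helpers here are made up, not the actual ipc/sem.c switch):

    switch (cmd) {
    case CMD_PREPARE:
        setup();
        /* fall through */
    case CMD_START:
        start();
        break;
    default:
        break;
    }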
2019-03-07 | init/initramfs.c: provide more details in error messages | David Engraf | 1 file | -3/+3
Use distinct error messages when archive decompression failed. Link: http://lkml.kernel.org/r/20190212075635.7373-1-david.engraf@sysgo.com Signed-off-by: David Engraf <david.engraf@sysgo.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Tested-by: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Philippe Ombredanne <pombredanne@nexb.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 | lib/ubsan: default UBSAN_ALIGNMENT to not set | Anders Roxell | 1 file | -5/+9
When booting an allmodconfig kernel, there are a lot of false positives, each reported as 'UBSAN: Undefined behaviour in ...' followed by a call trace. These UBSAN warnings are a result of enabling the noisy CONFIG_UBSAN_ALIGNMENT, which is disabled by default if HAVE_EFFICIENT_UNALIGNED_ACCESS=y. It is noisy even on platforms without efficient unaligned access, e.g. people often add __cacheline_aligned_in_smp to structs but forget to align allocations of such structs (kmalloc() gives 8-byte alignment in the worst case). Rework so that an allmodconfig build, which turns everything into '=m' or '=y', turns off UBSAN_ALIGNMENT. [aryabinin@virtuozzo.com: changelog addition] Link: http://lkml.kernel.org/r/20181217150326.30933-1-anders.roxell@linaro.org Signed-off-by: Anders Roxell <anders.roxell@linaro.org> Suggested-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 | scripts/gdb: replace flags (MS_xyz -> SB_xyz) | Jackie Liu | 2 files | -12/+12
Since commit 1751e8a6cb93 ("Rename superblock flags (MS_xyz -> SB_xyz)"), scripts/gdb should be updated to replace MS_xyz with SB_xyz. This change didn't directly affect the running operation of scripts/gdb until commit e262e32d6bde "vfs: Suppress MS_* flag defs within the kernel unless explicitly enabled" removed the definitions used by constants.py. Update constants.py.in to utilise the new internal flags, matching the implementation at fs/proc_namespace.c::show_sb_opts. Note to stable, e262e32d6bde landed in v5.0-rc1 (which was just released), so we'll want this picked back to 5.0 stable once this patch hits mainline (akpm just picked it up). Without this, debugging a kernel via GDB+QEMU is broken in the 5.0 release. [kieran.bingham@ideasonboard.com: add fixes tag, reword commit message] Link: http://lkml.kernel.org/r/20190305103014.25847-1-kieran.bingham@ideasonboard.com Fixes: e262e32d6bde "vfs: Suppress MS_* flag defs within the kernel unless explicitly enabled" Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Kieran Bingham <kieran.bingham@ideasonboard.com> Cc: Felipe Balbi <felipe.balbi@linux.intel.com> Cc: Dan Robertson <danlrobertson89@gmail.com> Cc: Jan Kiszka <jan.kiszka@siemens.com> Cc: David Howells <dhowells@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 | kcov: convert kcov.refcount to refcount_t | Elena Reshetova | 1 file | -4/+5
atomic_t variables are currently used to implement reference counters with the following properties: - counter is initialized to 1 using atomic_set() - a resource is freed upon counter reaching zero - once counter reaches zero, its further increments aren't allowed - counter schema uses basic atomic operations (set, inc, inc_not_zero, dec_and_test, etc.) Such atomic variables should be converted to a newly provided refcount_t type and API that prevents accidental counter overflows and underflows. This is important since overflows and underflows can lead to use-after-free situation and be exploitable. The variable kcov.refcount is used as pure reference counter. Convert it to refcount_t and fix up the operations. **Important note for maintainers: Some functions from refcount_t API defined in lib/refcount.c have different memory ordering guarantees than their atomic counterparts. The full comparison can be seen in https://lkml.org/lkml/2017/11/15/57 and it is hopefully soon in state to be merged to the documentation tree. Normally the differences should not matter since refcount_t provides enough guarantees to satisfy the refcounting use cases, but in some rare cases it might matter. Please double check that you don't have some undocumented memory guarantees for this variable usage. For the kcov.refcount it might make a difference in following places: - kcov_put(): decrement in refcount_dec_and_test() only provides RELEASE ordering and control dependency on success vs. fully ordered atomic counterpart Link: http://lkml.kernel.org/r/1547634429-772-1-git-send-email-elena.reshetova@intel.com Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Suggested-by: Kees Cook <keescook@chromium.org> Reviewed-by: David Windsor <dwindsor@gmail.com> Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com> Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Andrea Parri <andrea.parri@amarulasolutions.com> Cc: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
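The mechanical part of such a conversion looks like this (a sketch of the general pattern, not the exact kcov hunks; the kcov_free() name is illustrative):

    /* before: a plain atomic_t used as a reference count */
    atomic_t refcount;
    atomic_set(&kcov->refcount, 1);
    atomic_inc(&kcov->refcount);
    if (atomic_dec_and_test(&kcov->refcount))
        kcov_free(kcov);

    /* after: refcount_t saturates on overflow/underflow instead of wrapping */
    refcount_t refcount;
    refcount_set(&kcov->refcount, 1);
    refcount_inc(&kcov->refcount);
    if (refcount_dec_and_test(&kcov->refcount))
        kcov_free(kcov);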
2019-03-07 | kcov: no need to check return value of debugfs_create functions | Greg Kroah-Hartman | 1 file | -4/+2
When calling debugfs functions, there is no need to ever check the return value. The function can work or not, but the code logic should never do something different based on this. Link: http://lkml.kernel.org/r/20190122152151.16139-46-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: "Steven Rostedt (VMware)" <rostedt@goodmis.org> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Anders Roxell <anders.roxell@linaro.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
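In practice that means dropping patterns like the following (a generic sketch, not the exact kcov code, which may use a different debugfs helper):

    /* before: error handling that cannot usefully change behaviour */
    struct dentry *d = debugfs_create_file("kcov", 0600, NULL, NULL, &kcov_fops);
    if (!d) {
        pr_err("couldn't create kcov debugfs entry\n");
        return -ENOMEM;
    }
    return 0;

    /* after: just make the call; a debugfs failure is not fatal */
    debugfs_create_file("kcov", 0600, NULL, NULL, &kcov_fops);
    return 0;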
2019-03-07 | kernel/configs: use .incbin directive to embed config_data.gz | Masahiro Yamada | 4 files | -35/+21
This slightly optimizes the kernel/configs.c build. bin2c is not very efficient because it converts a data file into a huge array to embed it into a *.c file. Instead, we can use the .incbin directive. Also, this simplifies the code; Makefile is cleaner, and the way to get the offset/size of the config_data.gz is more straightforward. I used the "asm" statement in *.c instead of splitting it into *.S because MODULE_* tags are not supported in *.S files. I also cleaned up kernel/.gitignore; "config_data.gz" is unneeded because the top-level .gitignore takes care of the "*.gz" pattern. [yamada.masahiro@socionext.com: v2] Link: http://lkml.kernel.org/r/1550108893-21226-1-git-send-email-yamada.masahiro@socionext.com Link: http://lkml.kernel.org/r/1549941160-8084-1-git-send-email-yamada.masahiro@socionext.com Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Alexander Popov <alex.popov@linux.com> Cc: Kees Cook <keescook@chromium.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
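The .incbin approach boils down to something like the following (a rough sketch of the pattern; the section and symbol names here are illustrative, not necessarily the ones used in the commit):

    /* Embed the gzipped config directly, without round-tripping through bin2c. */
    asm (
    "   .pushsection .rodata, \"a\"         \n"
    "   .global kernel_config_data          \n"
    "kernel_config_data:                    \n"
    "   .incbin \"kernel/config_data.gz\"   \n"
    "   .global kernel_config_data_end      \n"
    "kernel_config_data_end:                \n"
    "   .popsection                         \n"
    );

    extern char kernel_config_data[];
    extern char kernel_config_data_end[];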
2019-03-07 | configs: get rid of obsolete CONFIG_ENABLE_WARN_DEPRECATED | Alexey Brodkin | 56 files | -56/+2
This Kconfig option was removed during v4.19 development in commit 771c035372a0 ("deprecate the '__deprecated' attribute warnings entirely and for good") so there's no point to keep it in defconfigs any longer. FWIW defconfigs were patched with:

  --------------------------->8----------------------
  find . -name *_defconfig -exec sed -i '/CONFIG_ENABLE_WARN_DEPRECATED/d' {} \;
  --------------------------->8----------------------

Link: http://lkml.kernel.org/r/20190128152434.41969-1-abrodkin@synopsys.com Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 | kernel/gcov/gcc_3_4.c: use struct_size() in kzalloc() | Gustavo A. R. Silva | 1 file | -4/+2
One of the more common cases of allocation size calculations is finding the size of a structure that has a zero-sized array at the end, along with memory for some number of elements for that array. For example:

  struct foo {
      int stuff;
      void *entry[];
  };

  instance = kzalloc(sizeof(struct foo) + sizeof(void *) * count, GFP_KERNEL);

Instead of leaving these open-coded and prone to type mistakes, we can now use the new struct_size() helper:

  instance = kzalloc(struct_size(instance, entry, count), GFP_KERNEL);

This code was detected with the help of Coccinelle. Link: http://lkml.kernel.org/r/20190109172445.GA15908@embeddedor Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 | sysctl: handle overflow for file-max | Christian Brauner | 1 file | -0/+3
Currently, when writing

  echo 18446744073709551616 > /proc/sys/fs/file-max

/proc/sys/fs/file-max will overflow and be set to 0. That quickly crashes the system. This commit sets the max and min value for file-max. The max value is set to long int. Any higher value cannot currently be used as the percpu counters are long ints and not unsigned integers. Note that the file-max value is ultimately parsed via __do_proc_doulongvec_minmax(). This function does not report an error when min or max are exceeded, which means that if a value larger than long int is written, userspace will not receive an error; instead the old value will be kept. There is an argument to be made that this should be changed and __do_proc_doulongvec_minmax() should return an error when a dedicated min or max value is exceeded. However, this has the potential to break userspace, so let's defer it to an RFC patch. Link: http://lkml.kernel.org/r/20190107222700.15954-3-christian@brauner.io Signed-off-by: Christian Brauner <christian@brauner.io> Acked-by: Kees Cook <keescook@chromium.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Joe Lawrence <joe.lawrence@redhat.com> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Waiman Long <longman@redhat.com> [christian@brauner.io: v4] Link: http://lkml.kernel.org/r/20190210203943.8227-3-christian@brauner.io Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
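Conceptually the fix is to give the sysctl table entry explicit bounds, along these lines (a sketch; the real entry lives in kernel/sysctl.c and the variable names here are illustrative):

    static unsigned long zero_ul;
    static unsigned long long_max = LONG_MAX;

    static struct ctl_table fs_table_sketch[] = {
        {
            .procname     = "file-max",
            .data         = &files_stat.max_files,
            .maxlen       = sizeof(files_stat.max_files),
            .mode         = 0644,
            .proc_handler = proc_doulongvec_minmax,
            .extra1       = &zero_ul,   /* min */
            .extra2       = &long_max,  /* max: the percpu counters are longs */
        },
        { }
    };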
2019-03-07 | sysctl: handle overflow in proc_get_long | Christian Brauner | 1 file | -1/+39
proc_get_long() is a funny function. It uses simple_strtoul() and for a good reason. proc_get_long() wants to always succeed the parse and return the maybe incorrect value and the trailing characters to check against a pre-defined list of acceptable trailing values. However, simple_strtoul() explicitly ignores overflows which can cause funny things like the following to happen:

  echo 18446744073709551616 > /proc/sys/fs/file-max
  cat /proc/sys/fs/file-max
  0

(Which will cause your system to silently die behind your back.) On the other hand kstrtoul() does do overflow detection but does not return the trailing characters, and also fails the parse when anything other than '\n' is a trailing character whereas proc_get_long() wants to be more lenient. Now, before adding another kstrtoul() function let's simply add a static parse strtoul_lenient() which:

  - fails on overflow with -ERANGE
  - returns the trailing characters to the caller

The reason why we should fail on ERANGE is that we already do a partial fail on overflow right now. Namely, when the TMPBUFLEN is exceeded. So we already reject values such as 184467440737095516160 (21 chars) but accept values such as 18446744073709551616 (20 chars) but both are overflows. So we should just always reject 64bit overflows and not special-case this based on the number of chars. Link: http://lkml.kernel.org/r/20190107222700.15954-2-christian@brauner.io Signed-off-by: Christian Brauner <christian@brauner.io> Acked-by: Kees Cook <keescook@chromium.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Joe Lawrence <joe.lawrence@redhat.com> Cc: Waiman Long <longman@redhat.com> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
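A minimal sketch of such a lenient-but-overflow-aware parser is shown below. It is illustrative only (base-10, unsigned, no sign handling) and is not the in-kernel implementation; the point is simply "reject overflow with -ERANGE, but hand the trailing characters back":

    static int strtoul_lenient_sketch(const char *cp, char **endp,
                                      unsigned long *res)
    {
        unsigned long long val = 0;

        while (isdigit(*cp)) {
            unsigned int d = *cp - '0';

            if (val > (ULLONG_MAX - d) / 10)
                return -ERANGE;          /* reject instead of wrapping */
            val = val * 10 + d;
            cp++;
        }
        if (val > ULONG_MAX)
            return -ERANGE;
        if (endp)
            *endp = (char *)cp;          /* caller inspects trailing chars */
        *res = val;
        return 0;
    }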
2019-03-07 | rapidio/mport_cdev: mark expected switch fall-through | Gustavo A. R. Silva | 1 file | -0/+1
In preparation for enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. This patch fixes the following warning:

  drivers/rapidio/devices/rio_mport_cdev.c: In function `mport_release_mapping':
  drivers/rapidio/devices/rio_mport_cdev.c:2151:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
     rio_unmap_inb_region(mport, map->phys_addr);
     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    CC      drivers/regulator/fixed-helper.o
    CC      drivers/pinctrl/stm32/pinctrl-stm32f429.o
  drivers/rapidio/devices/rio_mport_cdev.c:2152:2: note: here
    case MAP_DMA:
    ^~~~

Warning level 3 was used: -Wimplicit-fallthrough=3 This patch is part of the ongoing efforts to enable -Wimplicit-fallthrough. Link: http://lkml.kernel.org/r/20190212175014.GA14326@embeddedor Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Acked-by: Alexandre Bounine <alex.bou9@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 | drivers/rapidio/rio_cm.c: fix potential oops in riocm_ch_listen() | Dan Carpenter | 1 file | -1/+3
If riocm_get_channel() fails, then we should just return -EINVAL. Calling riocm_put_channel() will trigger a NULL dereference and generally we should call put() if the get() didn't succeed. Link: http://lkml.kernel.org/r/20190110130230.GB27017@kadam Fixes: b6e8d4aa1110 ("rapidio: add RapidIO channelized messaging driver") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: Alexandre Bounine <alexandre.bounine@idt.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 | kernel: workqueue: clarify wq_worker_last_func() caller requirements | Johannes Weiner | 1 file | -0/+10
This function can only be called safely from very specific scheduler contexts. Document those. Link: http://lkml.kernel.org/r/20190206150528.31198-1-hannes@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Suggested-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 | exec: increase BINPRM_BUF_SIZE to 256 | Oleg Nesterov | 2 files | -2/+2
Large enterprise clients often run applications out of networked file systems where the IT mandated layout of project volumes can end up leading to paths that are longer than 128 characters. Bumping this up to the next order of two solves this problem in all but the most egregious case while still fitting into a 512b slab. [oleg@redhat.com: update comment, per Kees] Link: http://lkml.kernel.org/r/20181112160956.GA28472@redhat.com Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reported-by: Ben Woodard <woodard@redhat.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Kees Cook <keescook@chromium.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 | fs/exec.c: replace opencoded set_mask_bits() | Vineet Gupta | 1 file | -6/+1
Link: http://lkml.kernel.org/r/1548275584-18096-2-git-send-email-vgupta@synopsys.com Link: http://lkml.kernel.org/g/20150807115710.GA16897@redhat.com Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Reviewed-by: Anthony Yznaga <anthony.yznaga@oracle.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Miklos Szeredi <mszeredi@redhat.com> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
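set_mask_bits() replaces open-coded read-modify-write loops of this shape (a sketch based on the set_dumpable()-style pattern, not the verbatim diff; `mm` and `value` are assumed to come from the surrounding function):

    unsigned long old, new;

    /* before: hand-rolled cmpxchg() retry loop */
    do {
        old = READ_ONCE(mm->flags);
        new = (old & ~MMF_DUMPABLE_MASK) | value;
    } while (cmpxchg(&mm->flags, old, new) != old);

    /* after: one helper expresses "clear this mask, then set these bits" */
    set_mask_bits(&mm->flags, MMF_DUMPABLE_MASK, value);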
2019-03-07 | fat: enable .splice_write to support splice on O_DIRECT file | Hou Tao | 1 file | -0/+1
Currently, splice() on an O_DIRECT-opened FAT file returns -EFAULT. That is because the default .splice_write, namely default_file_splice_write(), constructs an ITER_KVEC iov_iter, and dio_refill_pages() in the direct-I/O path cannot handle it. Fix it by implementing .splice_write through iter_file_splice_write(). Spotted by xfstests generic/091. Link: http://lkml.kernel.org/r/20190210094754.56355-1-houtao1@huawei.com Signed-off-by: Hou Tao <houtao1@huawei.com> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
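The fix amounts to wiring up .splice_write in the file_operations, along the lines of this sketch (structure name illustrative, other operations omitted):

    static const struct file_operations fat_file_fops_sketch = {
        .read_iter    = generic_file_read_iter,
        .write_iter   = generic_file_write_iter,
        .splice_read  = generic_file_splice_read,
        /* route splice writes through the iov_iter path so O_DIRECT works */
        .splice_write = iter_file_splice_write,
    };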
2019-03-07 | autofs: clear O_NONBLOCK on the pipe | NeilBrown | 1 file | -0/+2
autofs does not expect the pipe it is given to have O_NONBLOCK set - specifically if __kernel_write() in autofs_write() returns -EAGAIN, this is treated as a fatal error and the pipe is closed. For safety autofs should, therefore, clear the O_NONBLOCK flag. Releases of systemd prior to 8th February 2019 used pipe2(p, O_NONBLOCK|O_CLOEXEC) and thus (inadvertently) set this flag. Link: http://lkml.kernel.org/r/154993550902.3321.1183632970046073478.stgit@pluto-themaw-net Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 | fs/autofs/inode.c: use seq_puts() for simple strings in autofs_show_options() | Ian Kent | 1 file | -6/+6
Fix a checkpatch.pl WARNING about the use of seq_printf() to print simple strings in autofs_show_options(); use seq_puts() in this case. Link: http://lkml.kernel.org/r/154889012613.4863.12231175554744203482.stgit@pluto-themaw-net Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 | autofs: add ignore mount option | Ian Kent | 3 files | -2/+10
Add an autofs file system mount option that can be used to provide a generic indicator to applications that the mount entry should be ignored when displaying mount information. In other OSes that provide autofs and that provide a mount list to user space based on the kernel mount list, a no-op mount option ("ignore" is the one used on the most common OS) is allowed so that autofs file system users can optionally use it. The idea is that it be used by user space programs to exclude autofs mounts from consideration when reading the mounts list. Prior to the change to link /etc/mtab to /proc/self/mounts, all I needed to do to achieve this was to use mount(2) and not update the mtab, but now that no longer works. I know the symlinking happened a long time ago and I considered doing this then, but at the time I couldn't remember the commonly used option name and thought persuading the various utility maintainers would be too hard. But now I have a RHEL request to do this for compatibility for a widely used product, so I want to go ahead with it and try to enlist the help of some utility package maintainers. Clearly, without the option nothing can be done, so it's at least a start. Link: http://lkml.kernel.org/r/154725123970.11260.6113771566924907275.stgit@pluto-themaw-net Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 | init/calibrate.c: provide proper prototype | Valdis Kletnieks | 1 file | -0/+1
Sparse issues a warning:

  CHECK   init/calibrate.c
  init/calibrate.c:271:28: warning: symbol 'calibration_delay_done' was not declared. Should it be static?

The actual issue is that it's a __weak symbol that archs can override (in fact, ARM does so), but no prototype is provided. Let's provide one to prevent surprises. Link: http://lkml.kernel.org/r/18827.1548750938@turing-police.cc.vt.edu Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 | fs/binfmt_elf.c: spread const a little | Alexey Dobriyan | 1 file | -5/+3
Link: http://lkml.kernel.org/r/20190204202830.GC27482@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 | fs/binfmt_elf.c: use list_for_each_entry() | Alexey Dobriyan | 1 file | -10/+5
[adobriyan@gmail.com: fixup compilation] Link: http://lkml.kernel.org/r/20190205064334.GA2152@avx2 Link: http://lkml.kernel.org/r/20190204202800.GB27482@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 | fs/binfmt_elf.c: don't be afraid of overflow | Alexey Dobriyan | 1 file | -6/+3
Number of ELF program headers is 16-bit by spec, so total size comfortably fits into "unsigned int". Space savings: 7 bytes!

  add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-7 (-7)
  Function          old   new   delta
  load_elf_phdrs    137   130   -7

Link: http://lkml.kernel.org/r/20190204202715.GA27482@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 | epoll: use rwlock in order to reduce ep_poll_callback() contention | Roman Penyaev | 1 file | -36/+122
The goal of this patch is to reduce contention of ep_poll_callback() which can be called concurrently from different CPUs in case of high events rates and many fds per epoll. Problem can be very well reproduced by generating events (write to pipe or eventfd) from many threads, while consumer thread does polling. In other words this patch increases the bandwidth of events which can be delivered from sources to the poller by adding poll items in a lockless way to the list. The main change is in replacement of the spinlock with a rwlock, which is taken on read in ep_poll_callback(), and then by adding poll items to the tail of the list using xchg atomic instruction. Write lock is taken everywhere else in order to stop list modifications and guarantee that list updates are fully completed (I assume that write side of a rwlock does not starve, it seems qrwlock implementation has these guarantees). The following are some microbenchmark results based on the test [1] which starts threads which generate N events each. The test ends when all events are successfully fetched by the poller thread:

  spinlock
  ========
  threads  events/ms  run-time ms
        8       6402        12495
       16       7045        22709
       32       7395        43268

  rwlock + xchg
  =============
  threads  events/ms  run-time ms
        8      10038         7969
       16      12178        13138
       32      13223        24199

According to the results bandwidth of delivered events is significantly increased, thus execution time is reduced. This patch was tested with different sort of microbenchmarks and artificial delays (e.g. "udelay(get_random_int() & 0xff)") introduced in kernel on paths where items are added to lists. [1] https://github.com/rouming/test-tools/blob/master/stress-epoll.c Link: http://lkml.kernel.org/r/20190103150104.17128-5-rpenyaev@suse.de Signed-off-by: Roman Penyaev <rpenyaev@suse.de> Cc: Davidlohr Bueso <dbueso@suse.de> Cc: Jason Baron <jbaron@akamai.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 | epoll: unify awaking of wakeup source on ep_poll_callback() path | Roman Penyaev | 1 file | -8/+1
Original comment "Activate ep->ws since epi->ws may get deactivated at any time" indeed sounds loud, but it is incorrect, because the path where we check epi->ws is a path where insert to ovflist happens, i.e. ep_scan_ready_list() has taken ep->mtx and waits for this callback to finish, thus ep_modify() (which unregisters wakeup source) waits for ep_scan_ready_list(). Here in this patch I simply call ep_pm_stay_awake_rcu(), which is a bit extra for this path (indirectly protected by main ep->mtx, so even rcu is not needed), but I do not want to create another naked __ep_pm_stay_awake() variant only for this particular case, so rcu variant is just better for all the cases. Link: http://lkml.kernel.org/r/20190103150104.17128-4-rpenyaev@suse.de Signed-off-by: Roman Penyaev <rpenyaev@suse.de> Cc: Davidlohr Bueso <dbueso@suse.de> Cc: Jason Baron <jbaron@akamai.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 | epoll: make sure all elements in ready list are in FIFO order | Roman Penyaev | 1 file | -1/+5
Patch series "use rwlock in order to reduce ep_poll_callback() contention", v3. The last patch targets the contention problem in ep_poll_callback(), which can be very well reproduced by generating events (write to pipe or eventfd) from many threads, while consumer thread does polling. The following are some microbenchmark results based on the test [1] which starts threads which generate N events each. The test ends when all events are successfully fetched by the poller thread: spinlock ======== threads events/ms run-time ms 8 6402 12495 16 7045 22709 32 7395 43268 rwlock + xchg ============= threads events/ms run-time ms 8 10038 7969 16 12178 13138 32 13223 24199 According to the results bandwidth of delivered events is significantly increased, thus execution time is reduced. This patch (of 4): All coming events are stored in FIFO order and this is also should be applicable to ->ovflist, which originally is stack, i.e. LIFO. Thus to keep correct FIFO order ->ovflist should reversed by adding elements to the head of the read list but not to the tail. Link: http://lkml.kernel.org/r/20190103150104.17128-2-rpenyaev@suse.de Signed-off-by: Roman Penyaev <rpenyaev@suse.de> Reviewed-by: Davidlohr Bueso <dbueso@suse.de> Cc: Jason Baron <jbaron@akamai.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 | checkpatch: add test for SPDX-License-Identifier on wrong line # | Joe Perches | 1 file | -0/+8
Warn when any SPDX-License-Identifier: tag is not created on the proper line number. Link: http://lkml.kernel.org/r/9b74ee87f8c1b8fd310e213fcb4994d58610fcb6.camel@perches.com Signed-off-by: Joe Perches <joe@perches.com> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: "Enrico Weigelt, metux IT consult" <lkml@metux.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 | checkpatch: allow reporting C99 style comments | Vadim Bendebury | 1 file | -1/+2
Presently C99 style comments are removed unconditionally before actual patch validity check happens. This is a problem for some third party projects which use checkpatch.pl but do not allow C99 style comments. This patch adds yet another variable, named C99_COMMENT_TOLERANCE. If it is included in the --ignore command line or config file options list, C99 comments in the patch are reported as errors. Tested by processing a patch with a C99 style comment, it passes the check just fine unless '--ignore C99_COMMENT_TOLERANCE' is present in .checkpatch.conf. Link: http://lkml.kernel.org/r/20190110224957.25008-1-vbendeb@chromium.org Signed-off-by: Vadim Bendebury <vbendeb@chromium.org> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 | checkpatch: add some new alloc functions to various tests | Joe Perches | 1 file | -4/+15
Many new generic allocation functions like the kvmalloc family have been added recently to the kernel. The allocation functions test now includes:

  o kvmalloc and variants
  o kstrdup_const
  o kmemdup_nul
  o dma_alloc_coherent
  o alloc_skb and variants

Add a separate $allocFunctions variable to help make the allocation functions test a bit more readable. Miscellanea:

  o Use $allocFunctions in the unnecessary OOM message test and exclude uses with __GFP_NOWARN
  o Use $allocFunctions in the unnecessary cast test
  o Add the kvmalloc family to the preferred sizeof alloc style: foo = kvmalloc(sizeof(*foo), ...)

Link: http://lkml.kernel.org/r/a5e60a2b93e10baf84af063f6c8e56402273105d.camel@perches.com Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>