summaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2020-03-24irqchip/gic-v4.1: Plumb set_vcpu_affinity SGI callbacksMarc Zyngier2-0/+23
Just like for vLPIs, there is some configuration information that cannot be directly communicated through the normal irqchip API, and we have to use our good old friend set_vcpu_affinity as a side-band communication mechanism. This is used to configure group and priority for a given vSGI. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20200304203330.4967-13-maz@kernel.org
2020-03-24irqchip/gic-v4.1: Plumb get/set_irqchip_state SGI callbacksMarc Zyngier2-0/+91
To implement the get/set_irqchip_state callbacks (limited to the PENDING state), we have to use a particular set of hacks: - Reading the pending state is done by using a pair of new redistributor registers (GICR_VSGIR, GICR_VSGIPENDR), which allow the 16 interrupts state to be retrieved. - Setting the pending state is done by generating it as we'd otherwise do for a guest (writing to GITS_SGIR). - Clearing the pending state is done by emitting a VSGI command with the "clear" bit set. This requires some interesting locking though: - When talking to the redistributor, we must make sure that the VPE affinity doesn't change, hence taking the VPE lock. - At the same time, we must ensure that nobody accesses the same redistributor's GICR_VSGIR registers for a different VPE, which would corrupt the reading of the pending bits. We thus take the per-RD spinlock. Much fun. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20200304203330.4967-12-maz@kernel.org
2020-03-24irqchip/gic-v4.1: Plumb mask/unmask SGI callbacksMarc Zyngier1-0/+18
Implement mask/unmask for virtual SGIs by calling into the configuration helper. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20200304203330.4967-11-maz@kernel.org
2020-03-24irqchip/gic-v4.1: Add initial SGI configurationMarc Zyngier2-2/+93
The GICv4.1 ITS has yet another new command (VSGI) which allows a VPE-targeted SGI to be configured (or have its pending state cleared). Add support for this command and plumb it into the activate irqdomain callback so that it is ready to be used. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20200304203330.4967-10-maz@kernel.org
2020-03-24irqchip/gic-v4.1: Plumb skeletal VSGI irqchipMarc Zyngier3-4/+88
Since GICv4.1 has the capability to inject 16 SGIs into each VPE, and that I'm keen not to invent too many specific interfaces to manipulate these interrupts, let's pretend that each of these SGIs is an actual Linux interrupt. For that matter, let's introduce a minimal irqchip and irqdomain setup that will get fleshed up in the following patches. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20200304203330.4967-9-maz@kernel.org
2020-03-20irqchip/gic-v4.1: Map the ITS SGIR register pageMarc Zyngier1-2/+13
One of the new features of GICv4.1 is to allow virtual SGIs to be directly signaled to a VPE. For that, the ITS has grown a new 64kB page containing only a single register that is used to signal a SGI to a given VPE. Add a second mapping covering this new 64kB range, and take this opportunity to limit the original mapping to 64kB, which is enough to cover the span of the ITS registers. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20200304203330.4967-8-maz@kernel.org
2020-03-20irqchip/gic-v4.1: Advertise support v4.1 to KVMMarc Zyngier3-1/+12
Tell KVM that we support v4.1. Nothing uses this information so far. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20200304203330.4967-7-maz@kernel.org
2020-03-20irqchip/gic-v4.1: Ensure mutual exclusion betwen invalidations on the same RDMarc Zyngier3-0/+8
The GICv4.1 spec says that it is CONTRAINED UNPREDICTABLE to write to any of the GICR_INV{LPI,ALL}R registers if GICR_SYNCR.Busy == 1. To deal with it, we must ensure that only a single invalidation can happen at a time for a given redistributor. Add a per-RD lock to that effect and take it around the invalidation/syncr-read to deal with this. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20200304203330.4967-6-maz@kernel.org
2020-03-20irqchip/gic-v4.1: Wait for completion of redistributor's INVALL operationZenghui Yu1-0/+2
In GICv4.1, we emulate a guest-issued INVALL command by a direct write to GICR_INVALLR. Before we finish the emulation and go back to guest, let's make sure the physical invalidate operation is actually completed and no stale data will be left in redistributor. Per the specification, this can be achieved by polling the GICR_SYNCR.Busy bit (to zero). Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20200302092145.899-1-yuzenghui@huawei.com Link: https://lore.kernel.org/r/20200304203330.4967-5-maz@kernel.org
2020-03-19irqchip/gic-v4.1: Ensure mutual exclusion between vPE affinity change and RD ↵Marc Zyngier2-8/+53
access Before GICv4.1, all operations would be serialized with the affinity changes by virtue of using the same ITS command queue. With v4.1, things change, as invalidations (and a number of other operations) are issued using the redistributor MMIO frame. We must thus make sure that these redistributor accesses cannot race against aginst the affinity change, or we may end-up talking to the wrong redistributor. To ensure this, we expand the irq_to_cpuid() helper to take a spinlock when the LPI is mapped to a vLPI (a new per-VPE lock) on each operation that requires mutual exclusion. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20200304203330.4967-4-maz@kernel.org
2020-03-19irqchip/gic-v4.1: Skip absent CPUs while iterating over redistributorsMarc Zyngier1-0/+4
In a system that is only sparsly populated with CPUs, we can end-up with redistributors structures that are not initialized. Let's make sure we don't try and access those when iterating over them (in this case when checking we have a L2 VPE table). Fixes: 4e6437f12d6e ("irqchip/gic-v4.1: Ensure L2 vPE table is allocated at RD level") Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20200304203330.4967-3-maz@kernel.org
2020-03-19irqchip/gic-v3: Use SGIs without active state if offeredMarc Zyngier2-2/+10
To allow the direct injection of SGIs into a guest, the GICv4.1 architecture has to sacrifice the Active state so that SGIs look a lot like LPIs (they are injected by the same mechanism). In order not to break existing software, the architecture gives offers guests OSs the choice: SGIs with or without an active state. It is the hypervisors duty to honor the guest's choice. For this, the architecture offers a discovery bit indicating whether the GIC supports GICv4.1 SGIs (GICD_TYPER2.nASSGIcap), and another bit indicating whether the guest wants Active-less SGIs or not (controlled by GICD_CTLR.nASSGIreq). A hypervisor not supporting GICv4.1 SGIs would leave nASSGIcap clear, and a guest not knowing about GICv4.1 SGIs (or definitely wanting an Active state) would leave nASSGIreq clear (both being thankfully backward compatible with older revisions of the GIC). Since Linux is perfectly happy without an active state on SGIs, inform the hypervisor that we'll use that if offered. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20200304203330.4967-2-maz@kernel.org
2020-03-01Linux 5.6-rc4v5.6-rc4Linus Torvalds1-1/+1
2020-03-01Merge tag 'ext4_for_linus_stable' of ↵Linus Torvalds2-7/+7
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 fixes from Ted Ts'o: "Two more bug fixes (including a regression) for 5.6" * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: potential crash on allocation error in ext4_alloc_flex_bg_array() jbd2: fix data races at struct journal_head
2020-03-01Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds22-107/+171
Pull KVM fixes from Paolo Bonzini: "More bugfixes, including a few remaining "make W=1" issues such as too large frame sizes on some configurations. On the ARM side, the compiler was messing up shadow stacks between EL1 and EL2 code, which is easily fixed with __always_inline" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: VMX: check descriptor table exits on instruction emulation kvm: x86: Limit the number of "kvm: disabled by bios" messages KVM: x86: avoid useless copy of cpufreq policy KVM: allow disabling -Werror KVM: x86: allow compiling as non-module with W=1 KVM: Pre-allocate 1 cpumask variable per cpu for both pv tlb and pv ipis KVM: Introduce pv check helpers KVM: let declaration of kvm_get_running_vcpus match implementation KVM: SVM: allocate AVIC data structures based on kvm_amd module parameter arm64: Ask the compiler to __always_inline functions used by KVM at HYP KVM: arm64: Define our own swab32() to avoid a uapi static inline KVM: arm64: Ask the compiler to __always_inline functions used at HYP kvm: arm/arm64: Fold VHE entry/exit work into kvm_vcpu_run_vhe() KVM: arm/arm64: Fix up includes for trace.h
2020-03-01KVM: VMX: check descriptor table exits on instruction emulationOliver Upton1-0/+15
KVM emulates UMIP on hardware that doesn't support it by setting the 'descriptor table exiting' VM-execution control and performing instruction emulation. When running nested, this emulation is broken as KVM refuses to emulate L2 instructions by default. Correct this regression by allowing the emulation of descriptor table instructions if L1 hasn't requested 'descriptor table exiting'. Fixes: 07721feee46b ("KVM: nVMX: Don't emulate instructions in guest mode") Reported-by: Jan Kiszka <jan.kiszka@web.de> Cc: stable@vger.kernel.org Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Jim Mattson <jmattson@google.com> Signed-off-by: Oliver Upton <oupton@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-29Merge branch 'i2c/for-current-fixed' of ↵Linus Torvalds3-56/+34
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "I2C has three driver bugfixes for you. We agreed on the Mac regression to go in via I2C" * 'i2c/for-current-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: macintosh: therm_windtunnel: fix regression when instantiating devices i2c: altera: Fix potential integer overflow i2c: jz4780: silence log flood on txabrt
2020-02-29ext4: potential crash on allocation error in ext4_alloc_flex_bg_array()Dan Carpenter1-3/+3
If sbi->s_flex_groups_allocated is zero and the first allocation fails then this code will crash. The problem is that "i--" will set "i" to -1 but when we compare "i >= sbi->s_flex_groups_allocated" then the -1 is type promoted to unsigned and becomes UINT_MAX. Since UINT_MAX is more than zero, the condition is true so we call kvfree(new_groups[-1]). The loop will carry on freeing invalid memory until it crashes. Fixes: 7c990728b99e ("ext4: fix potential race between s_flex_groups online resizing and access") Reviewed-by: Suraj Jitindar Singh <surajjs@amazon.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: stable@kernel.org Link: https://lore.kernel.org/r/20200228092142.7irbc44yaz3by7nb@kili.mountain Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-02-29macintosh: therm_windtunnel: fix regression when instantiating devicesWolfram Sang1-21/+31
Removing attach_adapter from this driver caused a regression for at least some machines. Those machines had the sensors described in their DT, too, so they didn't need manual creation of the sensor devices. The old code worked, though, because manual creation came first. Creation of DT devices then failed later and caused error logs, but the sensors worked nonetheless because of the manually created devices. When removing attach_adaper, manual creation now comes later and loses the race. The sensor devices were already registered via DT, yet with another binding, so the driver could not be bound to it. This fix refactors the code to remove the race and only manually creates devices if there are no DT nodes present. Also, the DT binding is updated to match both, the DT and manually created devices. Because we don't know which device creation will be used at runtime, the code to start the kthread is moved to do_probe() which will be called by both methods. Fixes: 3e7bed52719d ("macintosh: therm_windtunnel: drop using attach_adapter") Link: https://bugzilla.kernel.org/show_bug.cgi?id=201723 Reported-by: Erhard Furtner <erhard_f@mailbox.org> Tested-by: Erhard Furtner <erhard_f@mailbox.org> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Cc: stable@kernel.org # v4.19+
2020-02-29jbd2: fix data races at struct journal_headQian Cai1-4/+4
journal_head::b_transaction and journal_head::b_next_transaction could be accessed concurrently as noticed by KCSAN, LTP: starting fsync04 /dev/zero: Can't open blockdev EXT4-fs (loop0): mounting ext3 file system using the ext4 subsystem EXT4-fs (loop0): mounted filesystem with ordered data mode. Opts: (null) ================================================================== BUG: KCSAN: data-race in __jbd2_journal_refile_buffer [jbd2] / jbd2_write_access_granted [jbd2] write to 0xffff99f9b1bd0e30 of 8 bytes by task 25721 on cpu 70: __jbd2_journal_refile_buffer+0xdd/0x210 [jbd2] __jbd2_journal_refile_buffer at fs/jbd2/transaction.c:2569 jbd2_journal_commit_transaction+0x2d15/0x3f20 [jbd2] (inlined by) jbd2_journal_commit_transaction at fs/jbd2/commit.c:1034 kjournald2+0x13b/0x450 [jbd2] kthread+0x1cd/0x1f0 ret_from_fork+0x27/0x50 read to 0xffff99f9b1bd0e30 of 8 bytes by task 25724 on cpu 68: jbd2_write_access_granted+0x1b2/0x250 [jbd2] jbd2_write_access_granted at fs/jbd2/transaction.c:1155 jbd2_journal_get_write_access+0x2c/0x60 [jbd2] __ext4_journal_get_write_access+0x50/0x90 [ext4] ext4_mb_mark_diskspace_used+0x158/0x620 [ext4] ext4_mb_new_blocks+0x54f/0xca0 [ext4] ext4_ind_map_blocks+0xc79/0x1b40 [ext4] ext4_map_blocks+0x3b4/0x950 [ext4] _ext4_get_block+0xfc/0x270 [ext4] ext4_get_block+0x3b/0x50 [ext4] __block_write_begin_int+0x22e/0xae0 __block_write_begin+0x39/0x50 ext4_write_begin+0x388/0xb50 [ext4] generic_perform_write+0x15d/0x290 ext4_buffered_write_iter+0x11f/0x210 [ext4] ext4_file_write_iter+0xce/0x9e0 [ext4] new_sync_write+0x29c/0x3b0 __vfs_write+0x92/0xa0 vfs_write+0x103/0x260 ksys_write+0x9d/0x130 __x64_sys_write+0x4c/0x60 do_syscall_64+0x91/0xb05 entry_SYSCALL_64_after_hwframe+0x49/0xbe 5 locks held by fsync04/25724: #0: ffff99f9911093f8 (sb_writers#13){.+.+}, at: vfs_write+0x21c/0x260 #1: ffff99f9db4c0348 (&sb->s_type->i_mutex_key#15){+.+.}, at: ext4_buffered_write_iter+0x65/0x210 [ext4] #2: ffff99f5e7dfcf58 (jbd2_handle){++++}, at: start_this_handle+0x1c1/0x9d0 [jbd2] #3: ffff99f9db4c0168 (&ei->i_data_sem){++++}, at: ext4_map_blocks+0x176/0x950 [ext4] #4: ffffffff99086b40 (rcu_read_lock){....}, at: jbd2_write_access_granted+0x4e/0x250 [jbd2] irq event stamp: 1407125 hardirqs last enabled at (1407125): [<ffffffff980da9b7>] __find_get_block+0x107/0x790 hardirqs last disabled at (1407124): [<ffffffff980da8f9>] __find_get_block+0x49/0x790 softirqs last enabled at (1405528): [<ffffffff98a0034c>] __do_softirq+0x34c/0x57c softirqs last disabled at (1405521): [<ffffffff97cc67a2>] irq_exit+0xa2/0xc0 Reported by Kernel Concurrency Sanitizer on: CPU: 68 PID: 25724 Comm: fsync04 Tainted: G L 5.6.0-rc2-next-20200221+ #7 Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019 The plain reads are outside of jh->b_state_lock critical section which result in data races. Fix them by adding pairs of READ|WRITE_ONCE(). Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Qian Cai <cai@lca.pw> Link: https://lore.kernel.org/r/20200222043111.2227-1-cai@lca.pw Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-02-29Merge tag 'scsi-fixes' of ↵Linus Torvalds8-7/+14
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Four small fixes. Three are in drivers for fairly obvious bugs. The fourth is a set of regressions introduced by the compat_ioctl changes because some of the compat updates wrongly replaced .ioctl instead of .compat_ioctl" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: compat_ioctl: cdrom: Replace .ioctl with .compat_ioctl in four appropriate places scsi: zfcp: fix wrong data and display format of SFP+ temperature scsi: sd_sbc: Fix sd_zbc_report_zones() scsi: libfc: free response frame from GPN_ID
2020-02-28Merge tag 'pci-v5.6-fixes-2' of ↵Linus Torvalds2-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fixes from Bjorn Helgaas: - Fix build issue on 32-bit ARM with old compilers (Marek Szyprowski) - Update MAINTAINERS for recent Cadence driver file move (Lukas Bulwahn) * tag 'pci-v5.6-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: MAINTAINERS: Correct Cadence PCI driver path PCI: brcmstb: Fix build on 32bit ARM platforms with older compilers
2020-02-28Merge tag 'block-5.6-2020-02-28' of git://git.kernel.dk/linux-blockLinus Torvalds12-70/+136
Pull block fixes from Jens Axboe: - Passthrough insertion fix (Ming) - Kill off some unused arguments (John) - blktrace RCU fix (Jan) - Dead fields removal for null_blk (Dongli) - NVMe polled IO fix (Bijan) * tag 'block-5.6-2020-02-28' of git://git.kernel.dk/linux-block: nvme-pci: Hold cq_poll_lock while completing CQEs blk-mq: Remove some unused function arguments null_blk: remove unused fields in 'nullb_cmd' blktrace: Protect q->blk_trace with RCU blk-mq: insert passthrough request into hctx->dispatch directly
2020-02-28Merge tag 'io_uring-5.6-2020-02-28' of git://git.kernel.dk/linux-blockLinus Torvalds3-91/+74
Pull io_uring fixes from Jens Axboe: - Fix for a race with IOPOLL used with SQPOLL (Xiaoguang) - Only show ->fdinfo if procfs is enabled (Tobias) - Fix for a chain with multiple personalities in the SQEs - Fix for a missing free of personality idr on exit - Removal of the spin-for-work optimization - Fix for next work lookup on request completion - Fix for non-vec read/write result progation in case of links - Fix for a fileset references on switch - Fix for a recvmsg/sendmsg 32-bit compatability mode * tag 'io_uring-5.6-2020-02-28' of git://git.kernel.dk/linux-block: io_uring: fix 32-bit compatability with sendmsg/recvmsg io_uring: define and set show_fdinfo only if procfs is enabled io_uring: drop file set ref put/get on switch io_uring: import_single_range() returns 0/-ERROR io_uring: pick up link work on submit reference drop io-wq: ensure work->task_pid is cleared on init io-wq: remove spin-for-work optimization io_uring: fix poll_list race for SETUP_IOPOLL|SETUP_SQPOLL io_uring: fix personality idr leak io_uring: handle multiple personalities in link chains
2020-02-28Merge branch 'nvme-5.6-rc4' of git://git.infradead.org/nvme into block-5.6Jens Axboe1-1/+1
Pull NVMe fix from Keith. * 'nvme-5.6-rc4' of git://git.infradead.org/nvme: nvme-pci: Hold cq_poll_lock while completing CQEs
2020-02-28Merge tag 'acpi-5.6-rc4' of ↵Linus Torvalds4-5/+42
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fixes from Rafael Wysocki: "Fix a couple of configuration issues in the ACPI watchdog (WDAT) driver (Mika Westerberg) and make it possible to disable that driver at boot time in case it still does not work as expected (Jean Delvare)" * tag 'acpi-5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: watchdog: Set default timeout in probe ACPI: watchdog: Fix gas->access_width usage ACPICA: Introduce ACPI_ACCESS_BYTE_WIDTH() macro ACPI: watchdog: Allow disabling WDAT at boot
2020-02-28Merge tag 'pm-5.6-rc4' of ↵Linus Torvalds4-7/+12
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "Fix a recent cpufreq initialization regression (Rafael Wysocki), revert a devfreq commit that made incompatible changes and broke user land on some systems (Orson Zhai), drop a stale reference to a document that has gone away recently (Jonathan Neuschäfer), and fix a typo in a hibernation code comment (Alexandre Belloni)" * tag 'pm-5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq: Fix policy initialization for internal governor drivers Revert "PM / devfreq: Modify the device name as devfreq(X) for sysfs" PM / hibernate: fix typo "reserverd_size" -> "reserved_size" Documentation: power: Drop reference to interface.rst
2020-02-28Merge tag 'zonefs-5.6-rc4' of ↵Linus Torvalds2-4/+5
git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs Pull zonefs fixes from Damien Le Moal: "Two fixes in here: - Revert the initial decision to silently ignore IOCB_NOWAIT for asynchronous direct IOs to sequential zone files. Instead, return an error to the user to signal that the feature is not supported (from Christoph) - A fix to zonefs Kconfig to select FS_IOMAP to avoid build failures if no other file system already selected this option (from Johannes)" * tag 'zonefs-5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs: zonefs: select FS_IOMAP zonefs: fix IOCB_NOWAIT handling
2020-02-28Merge tag 'kvmarm-fixes-5.6-1' of ↵Paolo Bonzini15-77/+84
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm fixes for 5.6, take #1 - Fix compilation on 32bit - Move VHE guest entry/exit into the VHE-specific entry code - Make sure all functions called by the non-VHE HYP code is tagged as __always_inline
2020-02-28kvm: x86: Limit the number of "kvm: disabled by bios" messagesErwan Velu1-2/+2
In older version of systemd(219), at boot time, udevadm is called with : /usr/bin/udevadm trigger --type=devices --action=add" This program generates an echo "add" in /sys/devices/system/cpu/cpu<x>/uevent, leading to the "kvm: disabled by bios" message in case of your Bios disabled the virtualization extensions. On a modern system running up to 256 CPU threads, this pollutes the Kernel logs. This patch offers to ratelimit this message to avoid any userspace program triggering this uevent printing this message too often. This patch is only a workaround but greatly reduce the pollution without breaking the current behavior of printing a message if some try to instantiate KVM on a system that doesn't support it. Note that recent versions of systemd (>239) do not have trigger this behavior. This patch will be useful at least for some using older systemd with recent Kernels. Signed-off-by: Erwan Velu <e.velu@criteo.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-28Merge branches 'pm-sleep' and 'pm-devfreq'Rafael J. Wysocki3-5/+2
* pm-sleep: PM / hibernate: fix typo "reserverd_size" -> "reserved_size" Documentation: power: Drop reference to interface.rst * pm-devfreq: Revert "PM / devfreq: Modify the device name as devfreq(X) for sysfs"
2020-02-28KVM: x86: avoid useless copy of cpufreq policyPaolo Bonzini1-5/+5
struct cpufreq_policy is quite big and it is not a good idea to allocate one on the stack. Just use cpufreq_cpu_get and cpufreq_cpu_put which is even simpler. Reported-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-28KVM: allow disabling -WerrorPaolo Bonzini2-1/+14
Restrict -Werror to well-tested configurations and allow disabling it via Kconfig. Reported-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-28KVM: x86: allow compiling as non-module with W=1Valdis Klētnieks2-0/+4
Compile error with CONFIG_KVM_INTEL=y and W=1: CC arch/x86/kvm/vmx/vmx.o arch/x86/kvm/vmx/vmx.c:68:32: error: 'vmx_cpu_id' defined but not used [-Werror=unused-const-variable=] 68 | static const struct x86_cpu_id vmx_cpu_id[] = { | ^~~~~~~~~~ cc1: all warnings being treated as errors When building with =y, the MODULE_DEVICE_TABLE macro doesn't generate a reference to the structure (or any code at all). This makes W=1 compiles unhappy. Wrap both in a #ifdef to avoid the issue. Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu> [Do the same for CONFIG_KVM_AMD. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-28KVM: Pre-allocate 1 cpumask variable per cpu for both pv tlb and pv ipisWanpeng Li1-12/+21
Nick Desaulniers Reported: When building with: $ make CC=clang arch/x86/ CFLAGS=-Wframe-larger-than=1000 The following warning is observed: arch/x86/kernel/kvm.c:494:13: warning: stack frame size of 1064 bytes in function 'kvm_send_ipi_mask_allbutself' [-Wframe-larger-than=] static void kvm_send_ipi_mask_allbutself(const struct cpumask *mask, int vector) ^ Debugging with: https://github.com/ClangBuiltLinux/frame-larger-than via: $ python3 frame_larger_than.py arch/x86/kernel/kvm.o \ kvm_send_ipi_mask_allbutself points to the stack allocated `struct cpumask newmask` in `kvm_send_ipi_mask_allbutself`. The size of a `struct cpumask` is potentially large, as it's CONFIG_NR_CPUS divided by BITS_PER_LONG for the target architecture. CONFIG_NR_CPUS for X86_64 can be as high as 8192, making a single instance of a `struct cpumask` 1024 B. This patch fixes it by pre-allocate 1 cpumask variable per cpu and use it for both pv tlb and pv ipis.. Reported-by: Nick Desaulniers <ndesaulniers@google.com> Acked-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-28KVM: Introduce pv check helpersWanpeng Li1-10/+24
Introduce some pv check helpers for consistency. Suggested-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-28KVM: let declaration of kvm_get_running_vcpus match implementationChristian Borntraeger1-1/+1
Sparse notices that declaration and implementation do not match: arch/s390/kvm/../../../virt/kvm/kvm_main.c:4435:17: warning: incorrect type in return expression (different address spaces) arch/s390/kvm/../../../virt/kvm/kvm_main.c:4435:17: expected struct kvm_vcpu [noderef] <asn:3> ** arch/s390/kvm/../../../virt/kvm/kvm_main.c:4435:17: got struct kvm_vcpu *[noderef] <asn:3> * Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-28KVM: SVM: allocate AVIC data structures based on kvm_amd module parameterPaolo Bonzini1-1/+2
Even if APICv is disabled at startup, the backing page and ir_list need to be initialized in case they are needed later. The only case in which this can be skipped is for userspace irqchip, and that must be done because avic_init_backing_page dereferences vcpu->arch.apic (which is NULL for userspace irqchip). Tested-by: rmuncrief@humanavance.com Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=206579 Reviewed-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-27Merge tag 'drm-fixes-2020-02-28' of git://anongit.freedesktop.org/drm/drmLinus Torvalds14-43/+138
Pull drm fixes from Dave Airlie: "Just some fixes for this week: amdgpu, radeon and i915. The main i915 one is a regression Gen7 (Ivybridge/Haswell), this moves them back from trying to use the full-ppgtt support to the aliasing version it used to use due to gpu hangs. Otherwise it's pretty quiet. amdgpu: - Drop DRIVER_USE_AGP - Fix memory leak in GPU reset - Resume fix for raven radeon: - Drop DRIVER_USE_AGP i915: - downgrade gen7 back to aliasing-ppgtt to avoid GPU hangs - shrinker fix - pmu leak and double free fixes - gvt user after free and virtual display reset fixes - randconfig build fix" * tag 'drm-fixes-2020-02-28' of git://anongit.freedesktop.org/drm/drm: drm/radeon: Inline drm_get_pci_dev drm/amdgpu: Drop DRIVER_USE_AGP drm/i915: Avoid recursing onto active vma from the shrinker drm/i915/pmu: Avoid using globals for PMU events drm/i915/pmu: Avoid using globals for CPU hotplug state drm/i915/gtt: Downgrade gen7 (ivb, byt, hsw) back to aliasing-ppgtt drm/i915: fix header test with GCOV amdgpu/gmc_v9: save/restore sdpif regs during S3 drm/amdgpu: fix memory leak during TDR test(v2) drm/i915/gvt: Fix orphan vgpu dmabuf_objs' lifetime drm/i915/gvt: Separate display reset from ALL_ENGINES reset
2020-02-28Merge tag 'drm-intel-fixes-2020-02-27' of ↵Dave Airlie7-38/+46
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes drm/i915 fixes for v5.6-rc4: - downgrade gen7 back to aliasing-ppgtt to avoid GPU hangs - shrinker fix - pmu leak and double free fixes - gvt user after free and virtual display reset fixes - randconfig build fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/874kvcsh00.fsf@intel.com
2020-02-28Merge tag 'amd-drm-fixes-5.6-2020-02-26' of ↵Dave Airlie7-5/+92
git://people.freedesktop.org/~agd5f/linux into drm-fixes amd-drm-fixes-5.6-2020-02-26: amdgpu: - Drop DRIVER_USE_AGP - Fix memory leak in GPU reset - Resume fix for raven radeon: - Drop DRIVER_USE_AGP Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200227034106.3912-1-alexander.deucher@amd.com
2020-02-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds70-536/+1062
Pull networking fixes from David Miller: 1) Fix leak in nl80211 AP start where we leak the ACL memory, from Johannes Berg. 2) Fix double mutex unlock in mac80211, from Andrei Otcheretianski. 3) Fix RCU stall in ipset, from Jozsef Kadlecsik. 4) Fix devlink locking in devlink_dpipe_table_register, from Madhuparna Bhowmik. 5) Fix race causing TX hang in ll_temac, from Esben Haabendal. 6) Stale eth hdr pointer in br_dev_xmit(), from Nikolay Aleksandrov. 7) Fix TX hash calculation bounds checking wrt. tc rules, from Amritha Nambiar. 8) Size netlink responses properly in schedule action code to take into consideration TCA_ACT_FLAGS. From Jiri Pirko. 9) Fix firmware paths for mscc PHY driver, from Antoine Tenart. 10) Don't register stmmac notifier multiple times, from Aaro Koskinen. 11) Various rmnet bug fixes, from Taehee Yoo. 12) Fix vsock deadlock in vsock transport release, from Stefano Garzarella. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (61 commits) net: dsa: mv88e6xxx: Fix masking of egress port mlxsw: pci: Wait longer before accessing the device after reset sfc: fix timestamp reconstruction at 16-bit rollover points vsock: fix potential deadlock in transport->release() unix: It's CONFIG_PROC_FS not CONFIG_PROCFS net: rmnet: fix packet forwarding in rmnet bridge mode net: rmnet: fix bridge mode bugs net: rmnet: use upper/lower device infrastructure net: rmnet: do not allow to change mux id if mux id is duplicated net: rmnet: remove rcu_read_lock in rmnet_force_unassociate_device() net: rmnet: fix suspicious RCU usage net: rmnet: fix NULL pointer dereference in rmnet_changelink() net: rmnet: fix NULL pointer dereference in rmnet_newlink() net: phy: marvell: don't interpret PHY status unless resolved mlx5: register lag notifier for init network namespace only unix: define and set show_fdinfo only if procfs is enabled hinic: fix a bug of rss configuration hinic: fix a bug of setting hw_ioctxt hinic: fix a irq affinity bug net/smc: check for valid ib_client_data ...
2020-02-27MAINTAINERS: Correct Cadence PCI driver pathLukas Bulwahn1-1/+1
de80f95ccb9c ("PCI: cadence: Move all files to per-device cadence directory") moved files of the PCI cadence drivers, but did not update the MAINTAINERS entry. Since then, ./scripts/get_maintainer.pl --self-test complains: warning: no file matches F: drivers/pci/controller/pcie-cadence* Repair the MAINTAINERS entry. Link: https://lore.kernel.org/r/20200221185402.4703-1-lukas.bulwahn@gmail.com Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2020-02-27io_uring: fix 32-bit compatability with sendmsg/recvmsgJens Axboe1-0/+10
We must set MSG_CMSG_COMPAT if we're in compatability mode, otherwise the iovec import for these commands will not do the right thing and fail the command with -EINVAL. Found by running the test suite compiled as 32-bit. Cc: stable@vger.kernel.org Fixes: aa1fa28fc73e ("io_uring: add support for recvmsg()") Fixes: 0fa03c624d8f ("io_uring: add support for sendmsg()") Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-02-27net: dsa: mv88e6xxx: Fix masking of egress portAndrew Lunn1-2/+2
Add missing ~ to the usage of the mask. Reported-by: Kevin Benson <Kevin.Benson@zii.aero> Reported-by: Chris Healy <Chris.Healy@zii.aero> Fixes: 5c74c54ce6ff ("net: dsa: mv88e6xxx: Split monitor port configuration") Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-27mlxsw: pci: Wait longer before accessing the device after resetAmit Cohen1-1/+1
During initialization the driver issues a reset to the device and waits for 100ms before checking if the firmware is ready. The waiting is necessary because before that the device is irresponsive and the first read can result in a completion timeout. While 100ms is sufficient for Spectrum-1 and Spectrum-2, it is insufficient for Spectrum-3. Fix this by increasing the timeout to 200ms. Fixes: da382875c616 ("mlxsw: spectrum: Extend to support Spectrum-3 ASIC") Signed-off-by: Amit Cohen <amitc@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-27sfc: fix timestamp reconstruction at 16-bit rollover pointsAlex Maftei (amaftei)1-3/+35
We can't just use the top bits of the last sync event as they could be off-by-one every 65,536 seconds, giving an error in reconstruction of 65,536 seconds. This patch uses the difference in the bottom 16 bits (mod 2^16) to calculate an offset that needs to be applied to the last sync event to get to the current time. Signed-off-by: Alexandru-Mihai Maftei <amaftei@solarflare.com> Acked-by: Martin Habets <mhabets@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-27vsock: fix potential deadlock in transport->release()Stefano Garzarella3-13/+12
Some transports (hyperv, virtio) acquire the sock lock during the .release() callback. In the vsock_stream_connect() we call vsock_assign_transport(); if the socket was previously assigned to another transport, the vsk->transport->release() is called, but the sock lock is already held in the vsock_stream_connect(), causing a deadlock reported by syzbot: INFO: task syz-executor280:9768 blocked for more than 143 seconds. Not tainted 5.6.0-rc1-syzkaller #0 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. syz-executor280 D27912 9768 9766 0x00000000 Call Trace: context_switch kernel/sched/core.c:3386 [inline] __schedule+0x934/0x1f90 kernel/sched/core.c:4082 schedule+0xdc/0x2b0 kernel/sched/core.c:4156 __lock_sock+0x165/0x290 net/core/sock.c:2413 lock_sock_nested+0xfe/0x120 net/core/sock.c:2938 virtio_transport_release+0xc4/0xd60 net/vmw_vsock/virtio_transport_common.c:832 vsock_assign_transport+0xf3/0x3b0 net/vmw_vsock/af_vsock.c:454 vsock_stream_connect+0x2b3/0xc70 net/vmw_vsock/af_vsock.c:1288 __sys_connect_file+0x161/0x1c0 net/socket.c:1857 __sys_connect+0x174/0x1b0 net/socket.c:1874 __do_sys_connect net/socket.c:1885 [inline] __se_sys_connect net/socket.c:1882 [inline] __x64_sys_connect+0x73/0xb0 net/socket.c:1882 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe To avoid this issue, this patch remove the lock acquiring in the .release() callback of hyperv and virtio transports, and it holds the lock when we call vsk->transport->release() in the vsock core. Reported-by: syzbot+731710996d79d0d58fbc@syzkaller.appspotmail.com Fixes: 408624af4c89 ("vsock: use local transport when it is loaded") Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-27unix: It's CONFIG_PROC_FS not CONFIG_PROCFSDavid S. Miller1-1/+1
Fixes: 3a12500ed5dd ("unix: define and set show_fdinfo only if procfs is enabled") Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-27Merge branch 'net-rmnet-fix-several-bugs'David S. Miller5-107/+98
Taehee Yoo says: ==================== net: rmnet: fix several bugs This patchset is to fix several bugs in RMNET module. 1. The first patch fixes NULL-ptr-deref in rmnet_newlink(). When rmnet interface is being created, it uses IFLA_LINK without checking NULL. So, if userspace doesn't set IFLA_LINK, panic will occur. In this patch, checking NULL pointer code is added. 2. The second patch fixes NULL-ptr-deref in rmnet_changelink(). To get real device in rmnet_changelink(), it uses IFLA_LINK. But, IFLA_LINK should not be used in rmnet_changelink(). 3. The third patch fixes suspicious RCU usage in rmnet_get_port(). rmnet_get_port() uses rcu_dereference_rtnl(). But, rmnet_get_port() is used by datapath. So, rcu_dereference_bh() should be used instead of rcu_dereference_rtnl(). 4. The fourth patch fixes suspicious RCU usage in rmnet_force_unassociate_device(). RCU critical section should not be scheduled. But, unregister_netdevice_queue() in the rmnet_force_unassociate_device() would be scheduled. So, the RCU warning occurs. In this patch, the rcu_read_lock() in the rmnet_force_unassociate_device() is removed because it's unnecessary. 5. The fifth patch fixes duplicate MUX ID case. RMNET MUX ID is unique. So, rmnet interface isn't allowed to be created, which have a duplicate MUX ID. But, only rmnet_newlink() checks this condition, rmnet_changelink() doesn't check this. So, duplicate MUX ID case would happen. 6. The sixth patch fixes upper/lower interface relationship problems. When IFLA_LINK is used, the upper/lower infrastructure should be used. Because it checks the maximum depth of upper/lower interfaces and it also checks circular interface relationship, etc. In this patch, netdev_upper_dev_link() is used. 7. The seventh patch fixes bridge related problems. a) ->ndo_del_slave() doesn't work. b) It couldn't detect circular upper/lower interface relationship. c) It couldn't prevent stack overflow because of too deep depth of upper/lower interface d) It doesn't check the number of lower interfaces. e) Panics because of several reasons. These problems are actually the same problem. So, this patch fixes these problems. 8. The eighth patch fixes packet forwarding issue in bridge mode Packet forwarding is not working in rmnet bridge mode. Because when a packet is forwarded, skb_push() for an ethernet header is needed. But it doesn't call skb_push(). So, the ethernet header will be lost. Change log: - update commit logs. - drop two patches in this patchset because of wrong target branch. - ("net: rmnet: add missing module alias") - ("net: rmnet: print error message when command fails") - remove unneessary rcu_read_lock() in the third patch. - use rcu_dereference_bh() instead of rcu_dereference in third patch. - do not allow to add a bridge device if rmnet interface is already bridge mode in the seventh patch. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>