summaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2018-02-01Merge branch 'x86/hyperv' of ↵Radim Krčmář2721-46060/+89289
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Topic branch for stable KVM clockource under Hyper-V. Thanks to Christoffer Dall for resolving the ARM conflict.
2018-01-31kvm: x86: remove efer_reload entry in kvm_vcpu_statLongpeng(Mike)2-2/+0
The efer_reload is never used since commit 26bb0981b3ff ("KVM: VMX: Use shared msr infrastructure"), so remove it. Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2018-01-31KVM: x86: AMD Processor Topology InformationStanislav Lanci1-1/+2
This patch allow to enable x86 feature TOPOEXT. This is needed to provide information about SMT on AMD Zen CPUs to the guest. Signed-off-by: Stanislav Lanci <pixo@polepetko.eu> Tested-by: Nick Sarnie <commendsarnex@gmail.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2018-01-31x86/kvm/vmx: do not use vm-exit instruction length for fast MMIO when ↵Vitaly Kuznetsov2-2/+17
running nested I was investigating an issue with seabios >= 1.10 which stopped working for nested KVM on Hyper-V. The problem appears to be in handle_ept_violation() function: when we do fast mmio we need to skip the instruction so we do kvm_skip_emulated_instruction(). This, however, depends on VM_EXIT_INSTRUCTION_LEN field being set correctly in VMCS. However, this is not the case. Intel's manual doesn't mandate VM_EXIT_INSTRUCTION_LEN to be set when EPT MISCONFIG occurs. While on real hardware it was observed to be set, some hypervisors follow the spec and don't set it; we end up advancing IP with some random value. I checked with Microsoft and they confirmed they don't fill VM_EXIT_INSTRUCTION_LEN on EPT MISCONFIG. Fix the issue by doing instruction skip through emulator when running nested. Fixes: 68c3b4d1676d870f0453c31d5a52e7e65c7448ae Suggested-by: Radim Krčmář <rkrcmar@redhat.com> Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2018-01-31kvm: embed vcpu id to dentry of vcpu anon inodeMasatake YAMATO1-1/+4
All d-entries for vcpu have the same, "anon_inode:kvm-vcpu". That means it is impossible to know the mapping between fds for vcpu and vcpu from userland. # LC_ALL=C ls -l /proc/617/fd | grep vcpu lrwx------. 1 qemu qemu 64 Jan 7 16:50 18 -> anon_inode:kvm-vcpu lrwx------. 1 qemu qemu 64 Jan 7 16:50 19 -> anon_inode:kvm-vcpu It is also impossible to know the mapping between vma for kvm_run structure and vcpu from userland. # LC_ALL=C grep vcpu /proc/617/maps 7f9d842d0000-7f9d842d3000 rw-s 00000000 00:0d 20393 anon_inode:kvm-vcpu 7f9d842d3000-7f9d842d6000 rw-s 00000000 00:0d 20393 anon_inode:kvm-vcpu This change adds vcpu id to d-entries for vcpu. With this change you can get the following output: # LC_ALL=C ls -l /proc/617/fd | grep vcpu lrwx------. 1 qemu qemu 64 Jan 7 16:50 18 -> anon_inode:kvm-vcpu:0 lrwx------. 1 qemu qemu 64 Jan 7 16:50 19 -> anon_inode:kvm-vcpu:1 # LC_ALL=C grep vcpu /proc/617/maps 7f9d842d0000-7f9d842d3000 rw-s 00000000 00:0d 20393 anon_inode:kvm-vcpu:0 7f9d842d3000-7f9d842d6000 rw-s 00000000 00:0d 20393 anon_inode:kvm-vcpu:1 With the mappings known from the output, a tool like strace can report more details of qemu-kvm process activities. Here is the strace output of my local prototype: # ./strace -KK -f -p 617 2>&1 | grep 'KVM_RUN\| K' ... [pid 664] ioctl(18, KVM_RUN, 0) = 0 (KVM_EXIT_MMIO) K ready_for_interrupt_injection=1, if_flag=0, flags=0, cr8=0000000000000000, apic_base=0x000000fee00d00 K phys_addr=0, len=1634035803, [33, 0, 0, 0, 0, 0, 0, 0], is_write=112 [pid 664] ioctl(18, KVM_RUN, 0) = 0 (KVM_EXIT_MMIO) K ready_for_interrupt_injection=1, if_flag=1, flags=0, cr8=0000000000000000, apic_base=0x000000fee00d00 K phys_addr=0, len=1634035803, [33, 0, 0, 0, 0, 0, 0, 0], is_write=112 ... Signed-off-by: Masatake YAMATO <yamato@redhat.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2018-01-31kvm: Map PFN-type memory regions as writable (if possible)KarimAllah Ahmed1-2/+5
For EPT-violations that are triggered by a read, the pages are also mapped with write permissions (if their memory region is also writable). That would avoid getting yet another fault on the same page when a write occurs. This optimization only happens when you have a "struct page" backing the memory region. So also enable it for memory regions that do not have a "struct page". Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2018-01-31Merge tag 'kvm-arm-for-v4.16' of ↵Radim Krčmář37-362/+579
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm KVM/ARM Changes for v4.16 The changes for this version include icache invalidation optimizations (improving VM startup time), support for forwarded level-triggered interrupts (improved performance for timers and passthrough platform devices), a small fix for power-management notifiers, and some cosmetic changes.
2018-01-31Merge tag 'kvm-s390-next-4.16-3' of ↵Radim Krčmář1-1/+4
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux KVM: s390: update maintainers
2018-01-31x86/kvm: Make it compile on 32bit and with HYPYERVISOR_GUEST=nThomas Gleixner1-4/+4
The reenlightment support for hyperv slapped a direct reference to x86_hyper_type into the kvm code which results in the following build failure when CONFIG_HYPERVISOR_GUEST=n: arch/x86/kvm/x86.c:6259:6: error: ‘x86_hyper_type’ undeclared (first use in this function) arch/x86/kvm/x86.c:6259:6: note: each undeclared identifier is reported only once for each function it appears in Use the proper helper function to cure that. The 32bit compile fails because of: arch/x86/kvm/x86.c:5936:13: warning: ‘kvm_hyperv_tsc_notifier’ defined but not used [-Wunused-function] which is a real trainwreck engineering artwork. The callsite is wrapped into #ifdef CONFIG_X86_64, but the function itself has the #ifdef inside the function body. Make the function itself wrapped into the ifdef to cure that. Qualiteee.... Fixes: 0092e4346f49 ("x86/kvm: Support Hyper-V reenlightenment") Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: kvm@vger.kernel.org Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: "Michael Kelley (EOSG)" <Michael.H.Kelley@microsoft.com> Cc: Roman Kagan <rkagan@virtuozzo.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: devel@linuxdriverproject.org Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Cathy Avery <cavery@redhat.com> Cc: Mohammed Gamal <mmorsy@redhat.com>
2018-01-31KVM: arm/arm64: Fixup userspace irqchip static key optimizationChristoffer Dall1-1/+1
When I introduced a static key to avoid work in the critical path for userspace irqchips which is very rarely used, I accidentally messed up my logic and used && where I should have used ||, because the point was to short-circuit the evaluation in case userspace irqchips weren't even in use. This fixes an issue when running in-kernel irqchip VMs alongside userspace irqchip VMs. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Fixes: c44c232ee2d3 ("KVM: arm/arm64: Avoid work when userspace iqchips are not used") Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2018-01-31KVM: arm/arm64: Fix userspace_irqchip_in_use countingChristoffer Dall1-2/+3
We were not decrementing the static key count in the right location. kvm_arch_vcpu_destroy() is only called to clean up after a failed VCPU create attempt, whereas kvm_arch_vcpu_free() is called on teardown of the VM as well. Move the static key decrement call to kvm_arch_vcpu_free(). Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2018-01-31KVM: arm/arm64: Fix incorrect timer_is_pending logicChristoffer Dall1-19/+17
After the recently introduced support for level-triggered mapped interrupt, I accidentally left the VCPU thread busily going back and forward between the guest and the hypervisor whenever the guest was blocking, because I would always incorrectly report that a timer interrupt was pending. This is because the timer->irq.level field is not valid for mapped interrupts, where we offload the level state to the hardware, and as a result this field is always true. Luckily the problem can be relatively easily solved by not checking the cached signal state of either timer in kvm_timer_should_fire() but instead compute the timer state on the fly, which we do already if the cached signal state wasn't high. In fact, the only reason for checking the cached signal state was a tiny optimization which would only be potentially faster when the polling loop detects a pending timer interrupt, which is quite unlikely. Instead of duplicating the logic from kvm_arch_timer_handler(), we enlighten kvm_timer_should_fire() to report something valid when the timer state is loaded onto the hardware. We can then call this from kvm_arch_timer_handler() as well and avoid the call to __timer_snapshot_state() in kvm_arch_timer_get_input_level(). Reported-by: Tomasz Nowicki <tn@semihalf.com> Tested-by: Tomasz Nowicki <tn@semihalf.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2018-01-31MAINTAINERS: update KVM/s390 maintainersCornelia Huck1-1/+2
As I have neither too much time nor access to the architecture documentation anymore, let's switch my status from maintainer to reviewer. Janosch will step in as second maintainer. Acked-by: Janosch Frank <frankja@linux.vnet.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2018-01-31MAINTAINERS: add Halil as additional vfio-ccw maintainerCornelia Huck1-0/+1
Acked-by: Janosch Frank <frankja@linux.vnet.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Halil Pasic <pasic@linux.vnet.ibm.com> Acked-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2018-01-31MAINTAINERS: add David as a reviewer for KVM/s390Cornelia Huck1-0/+1
Acked-by: Janosch Frank <frankja@linux.vnet.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: David Hildenbrand <david@redhat.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2018-01-30x86/kvm: Support Hyper-V reenlightenmentVitaly Kuznetsov1-0/+45
When running nested KVM on Hyper-V guests its required to update masterclocks for all guests when L1 migrates to a host with different TSC frequency. Implement the procedure in the following way: - Pause all guests. - Tell the host (Hyper-V) to stop emulating TSC accesses. - Update the gtod copy, recompute clocks. - Unpause all guests. This is somewhat similar to cpufreq but there are two important differences: - TSC emulation can only be disabled globally (on all CPUs) - The new TSC frequency is not known until emulation is turned off so there is no way to 'prepare' for the event upfront. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: kvm@vger.kernel.org Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: "Michael Kelley (EOSG)" <Michael.H.Kelley@microsoft.com> Cc: Roman Kagan <rkagan@virtuozzo.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: devel@linuxdriverproject.org Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Cathy Avery <cavery@redhat.com> Cc: Mohammed Gamal <mmorsy@redhat.com> Link: https://lkml.kernel.org/r/20180124132337.30138-8-vkuznets@redhat.com
2018-01-30x86/kvm: Pass stable clocksource to guests when running nested on Hyper-VVitaly Kuznetsov1-25/+68
Currently, KVM is able to work in 'masterclock' mode passing PVCLOCK_TSC_STABLE_BIT to guests when the clocksource which is used on the host is TSC. When running nested on Hyper-V the guest normally uses a different one: TSC page which is resistant to TSC frequency changes on events like L1 migration. Add support for it in KVM. The only non-trivial change is in vgettsc(): when updating the gtod copy both the clock readout and tsc value have to be updated now. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: kvm@vger.kernel.org Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: "Michael Kelley (EOSG)" <Michael.H.Kelley@microsoft.com> Cc: Roman Kagan <rkagan@virtuozzo.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: devel@linuxdriverproject.org Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Cathy Avery <cavery@redhat.com> Cc: Mohammed Gamal <mmorsy@redhat.com> Link: https://lkml.kernel.org/r/20180124132337.30138-7-vkuznets@redhat.com
2018-01-30x86/irq: Count Hyper-V reenlightenment interruptsVitaly Kuznetsov3-0/+14
Hyper-V reenlightenment interrupts arrive when the VM is migrated, While they are not interesting in general it's important when L2 nested guests are running. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: kvm@vger.kernel.org Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: "Michael Kelley (EOSG)" <Michael.H.Kelley@microsoft.com> Cc: Roman Kagan <rkagan@virtuozzo.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: devel@linuxdriverproject.org Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Cathy Avery <cavery@redhat.com> Cc: Mohammed Gamal <mmorsy@redhat.com> Link: https://lkml.kernel.org/r/20180124132337.30138-6-vkuznets@redhat.com
2018-01-30x86/hyperv: Redirect reenlightment notifications on CPU offliningVitaly Kuznetsov1-1/+21
It is very unlikely for CPUs to get offlined when running on Hyper-V as there is a protection in the vmbus module which prevents it when the guest has any VMBus devices assigned. This, however, may change in future if an option to reassign an already active channel will be added. It is also possible to run without any Hyper-V devices or to have a CPU with no assigned channels. Reassign reenlightenment notifications to some other active CPU when the CPU which is assigned to them goes offline. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: kvm@vger.kernel.org Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: "Michael Kelley (EOSG)" <Michael.H.Kelley@microsoft.com> Cc: Roman Kagan <rkagan@virtuozzo.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: devel@linuxdriverproject.org Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Cathy Avery <cavery@redhat.com> Cc: Mohammed Gamal <mmorsy@redhat.com> Link: https://lkml.kernel.org/r/20180124132337.30138-5-vkuznets@redhat.com
2018-01-30x86/hyperv: Reenlightenment notifications supportVitaly Kuznetsov7-1/+143
Hyper-V supports Live Migration notification. This is supposed to be used in conjunction with TSC emulation: when a VM is migrated to a host with different TSC frequency for some short period the host emulates the accesses to TSC and sends an interrupt to notify about the event. When the guest is done updating everything it can disable TSC emulation and everything will start working fast again. These notifications weren't required until now as Hyper-V guests are not supposed to use TSC as a clocksource: in Linux the TSC is even marked as unstable on boot. Guests normally use 'tsc page' clocksource and host updates its values on migrations automatically. Things change when with nested virtualization: even when the PV clocksources (kvm-clock or tsc page) are passed through to the nested guests the TSC frequency and frequency changes need to be know.. Hyper-V Top Level Functional Specification (as of v5.0b) wrongly specifies EAX:BIT(12) of CPUID:0x40000009 as the feature identification bit. The right one to check is EAX:BIT(13) of CPUID:0x40000003. I was assured that the fix in on the way. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: kvm@vger.kernel.org Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: "Michael Kelley (EOSG)" <Michael.H.Kelley@microsoft.com> Cc: Roman Kagan <rkagan@virtuozzo.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: devel@linuxdriverproject.org Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Cathy Avery <cavery@redhat.com> Cc: Mohammed Gamal <mmorsy@redhat.com> Link: https://lkml.kernel.org/r/20180124132337.30138-4-vkuznets@redhat.com
2018-01-30x86/hyperv: Add a function to read both TSC and TSC page value simulateneouslyVitaly Kuznetsov2-4/+20
This is going to be used from KVM code where both TSC and TSC page value are needed. Nothing is supposed to use the function when Hyper-V code is compiled out, just BUG(). Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: kvm@vger.kernel.org Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: "Michael Kelley (EOSG)" <Michael.H.Kelley@microsoft.com> Cc: Roman Kagan <rkagan@virtuozzo.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: devel@linuxdriverproject.org Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Cathy Avery <cavery@redhat.com> Cc: Mohammed Gamal <mmorsy@redhat.com> Link: https://lkml.kernel.org/r/20180124132337.30138-3-vkuznets@redhat.com
2018-01-30x86/hyperv: Check for required priviliges in hyperv_init()Vitaly Kuznetsov1-1/+8
In hyperv_init() its presumed that it always has access to VP index and hypercall MSRs while according to the specification it should be checked if it's allowed to access the corresponding MSRs before accessing them. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: kvm@vger.kernel.org Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: "Michael Kelley (EOSG)" <Michael.H.Kelley@microsoft.com> Cc: Roman Kagan <rkagan@virtuozzo.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: devel@linuxdriverproject.org Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Cathy Avery <cavery@redhat.com> Cc: Mohammed Gamal <mmorsy@redhat.com> Link: https://lkml.kernel.org/r/20180124132337.30138-2-vkuznets@redhat.com
2018-01-30Merge branch 'x86-hyperv-for-linus' of ↵Linus Torvalds1-2/+10
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 hyperv update from Ingo Molnar: "Enable PCID support on Hyper-V guests" * 'x86-hyperv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/hyperv: Stop suppressing X86_FEATURE_PCID
2018-01-30Merge branch 'x86-cleanups-for-linus' of ↵Linus Torvalds9-33/+18
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cleanups from Ingo Molnar: "Misc cleanups" * 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: Remove unused IOMMU_STRESS Kconfig x86/extable: Mark exception handler functions visible x86/timer: Don't inline __const_udelay x86/headers: Remove duplicate #includes
2018-01-30Merge branch 'x86-apic-for-linus' of ↵Linus Torvalds1-6/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 apic cleanup from Ingo Molnar: "A single change simplifying the APIC code bit" * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/apic: Remove local var in flat_send_IPI_allbutself()
2018-01-30Merge branch 'sched-core-for-linus' of ↵Linus Torvalds12-161/+339
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "The main changes in this cycle were: - Implement frequency/CPU invariance and OPP selection for SCHED_DEADLINE (Juri Lelli) - Tweak the task migration logic for better multi-tasking workload scalability (Mel Gorman) - Misc cleanups, fixes and improvements" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/deadline: Make bandwidth enforcement scale-invariant sched/cpufreq: Move arch_scale_{freq,cpu}_capacity() outside of #ifdef CONFIG_SMP sched/cpufreq: Remove arch_scale_freq_capacity()'s 'sd' parameter sched/cpufreq: Always consider all CPUs when deciding next freq sched/cpufreq: Split utilization signals sched/cpufreq: Change the worker kthread to SCHED_DEADLINE sched/deadline: Move CPU frequency selection triggering points sched/cpufreq: Use the DEADLINE utilization signal sched/deadline: Implement "runtime overrun signal" support sched/fair: Only immediately migrate tasks due to interrupts if prev and target CPUs share cache sched/fair: Correct obsolete comment about cpufreq_update_util() sched/fair: Remove impossible condition from find_idlest_group_cpu() sched/cpufreq: Don't pass flags to sugov_set_iowait_boost() sched/cpufreq: Initialize sg_cpu->flags to 0 sched/fair: Consider RT/IRQ pressure in capacity_spare_wake() sched/fair: Use 'unsigned long' for utilization, consistently sched/core: Rework and clarify prepare_lock_switch() sched/fair: Remove unused 'curr' parameter from wakeup_gran sched/headers: Constify object_is_on_stack()
2018-01-30Merge branch 'ras-core-for-linus' of ↵Linus Torvalds4-14/+60
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 RAS updates from Ingo Molnar: - various AMD SMCA error parsing/reporting improvements (Yazen Ghannam) - extend Intel CMCI error reporting to more cases (Xie XiuQi) * 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/MCE: Make correctable error detection look at the Deferred bit x86/MCE: Report only DRAM ECC as memory errors on AMD systems x86/MCE/AMD: Define a function to get SMCA bank type x86/mce/AMD: Don't set DEF_INT_TYPE in MSR_CU_DEF_ERR on SMCA systems x86/MCE: Extend table to report action optional errors through CMCI too
2018-01-30Merge branch 'perf-core-for-linus' of ↵Linus Torvalds270-14728/+18017
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Ingo Molnar: "Kernel side changes: - Clean up the x86 instruction decoder (Masami Hiramatsu) - Add new uprobes optimization for PUSH instructions on x86 (Yonghong Song) - Add MSR_IA32_THERM_STATUS to the MSR events (Stephane Eranian) - Fix misc bugs, update documentation, plus various cleanups (Jiri Olsa) There's a large number of tooling side improvements: - Intel-PT/BTS improvements (Adrian Hunter) - Numerous 'perf trace' improvements (Arnaldo Carvalho de Melo) - Introduce an errno code to string facility (Hendrik Brueckner) - Various build system improvements (Jiri Olsa) - Add support for CoreSight trace decoding by making the perf tools use the external openCSD (Mathieu Poirier, Tor Jeremiassen) - Add ARM Statistical Profiling Extensions (SPE) support (Kim Phillips) - libtraceevent updates (Steven Rostedt) - Intel vendor event JSON updates (Andi Kleen) - Introduce 'perf report --mmaps' and 'perf report --tasks' to show info present in 'perf.data' (Jiri Olsa, Arnaldo Carvalho de Melo) - Add infrastructure to record first and last sample time to the perf.data file header, so that when processing all samples in a 'perf record' session, such as when doing build-id processing, or when specifically requesting that that info be recorded, use that in 'perf report --time', that also got support for percent slices in addition to absolute ones. I.e. now it is possible to ask for the samples in the 10%-20% time slice of a perf.data file (Jin Yao) - Allow system wide 'perf stat --per-thread', sorting the result (Jin Yao) E.g.: [root@jouet ~]# perf stat --per-thread --metrics IPC ^C Performance counter stats for 'system wide': make-22229 23,012,094,032 inst_retired.any # 0.8 IPC cc1-22419 692,027,497 inst_retired.any # 0.8 IPC gcc-22418 328,231,855 inst_retired.any # 0.9 IPC cc1-22509 220,853,647 inst_retired.any # 0.8 IPC gcc-22486 199,874,810 inst_retired.any # 1.0 IPC as-22466 177,896,365 inst_retired.any # 0.9 IPC cc1-22465 150,732,374 inst_retired.any # 0.8 IPC gcc-22508 112,555,593 inst_retired.any # 0.9 IPC cc1-22487 108,964,079 inst_retired.any # 0.7 IPC qemu-system-x86-2697 21,330,550 inst_retired.any # 0.3 IPC systemd-journal-551 20,642,951 inst_retired.any # 0.4 IPC docker-containe-17651 9,552,892 inst_retired.any # 0.5 IPC dockerd-current-9809 7,528,586 inst_retired.any # 0.5 IPC make-22153 12,504,194,380 inst_retired.any # 0.8 IPC python2-22429 12,081,290,954 inst_retired.any # 0.8 IPC <SNIP> python2-22429 15,026,328,103 cpu_clk_unhalted.thread cc1-22419 826,660,193 cpu_clk_unhalted.thread gcc-22418 365,321,295 cpu_clk_unhalted.thread cc1-22509 279,169,362 cpu_clk_unhalted.thread gcc-22486 210,156,950 cpu_clk_unhalted.thread <SNIP> 5.638075538 seconds time elapsed [root@jouet ~]# - Improve shell auto-completion of perf events (Jin Yao) - 'perf probe' improvements (Masami Hiramatsu) - Improve PMU infrastructure to support amp64's ThunderX2 implementation defined core events (Ganapatrao Kulkarni) - Various annotation related improvements and fixes (Thomas Richter) - Clarify usage of 'overwrite' and 'backward' in the evlist/mmap code, removing the 'overwrite' parameter from several functions as it was always used it as 'false' (Wang Nan) - Fix/improve 'perf record' reverse recording support (Wang Nan) - Improve command line options documentation (Sihyeon Jang) - Optimize sample parsing for ordering events, where we don't need to parse all the PERF_SAMPLE_ bits, just the ones leading to the timestamp needed to reorder events (Jiri Olsa) - Generalize the annotation code to support other source information besides objdump/DWARF obtained ones, starting with python scripts, that will is slated to be merged soon (Jiri Olsa) - ... and a lot more that I failed to list, see the shortlog and changelog for details" * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (262 commits) perf trace beauty flock: Move to separate object file perf evlist: Remove fcntl.h from evlist.h perf trace beauty futex: Beautify FUTEX_BITSET_MATCH_ANY perf trace: Do not print from time delta for interrupted syscall lines perf trace: Add --print-sample perf bpf: Remove misplaced __maybe_unused attribute MAINTAINERS: Adding entry for CoreSight trace decoding perf tools: Add mechanic to synthesise CoreSight trace packets perf tools: Add full support for CoreSight trace decoding pert tools: Add queue management functionality perf tools: Add functionality to communicate with the openCSD decoder perf tools: Add support for decoding CoreSight trace data perf tools: Add decoder mechanic to support dumping trace data perf tools: Add processing of coresight metadata perf tools: Add initial entry point for decoder CoreSight traces perf tools: Integrating the CoreSight decoding library perf vendor events intel: Update IvyTown files to V20 perf vendor events intel: Update IvyBridge files to V20 perf vendor events intel: Update BroadwellDE events to V7 perf vendor events intel: Update SkylakeX events to V1.06 ...
2018-01-30Merge branch 'locking-core-for-linus' of ↵Linus Torvalds5-50/+53
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: "The main changes relate to making lock_is_held() et al (and external wrappers of them) work on const data types - this requires const propagation through the depths of lockdep. This removes a number of ugly type hacks the external helpers used" * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: lockdep: Convert some users to const lockdep: Make lockdep checking constant lockdep: Assign lock keys on registration
2018-01-30Merge branch 'efi-core-for-linus' of ↵Linus Torvalds8-123/+422
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull EFI updates from Ingo Molnar: "The biggest change in this cycle was the addition of ARM CPER error decoding when printing EFI errors into the kernel log. There are also misc smaller updates: documentation update, cleanups and an EFI memory map permissions quirk" * 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/efi: Clarify that reset attack mitigation needs appropriate userspace efi: Parse ARM error information value efi: Move ARM CPER code to new file efi: Use PTR_ERR_OR_ZERO() arm64/efi: Ignore EFI_MEMORY_XP attribute if RP and/or WP are set efi/capsule-loader: Fix pr_err() string to end with newline
2018-01-30Merge branch 'core-rcu-for-linus' of ↵Linus Torvalds81-748/+501
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU updates from Ingo Molnar: "The main RCU changes in this cycle were: - Updates to use cond_resched() instead of cond_resched_rcu_qs() where feasible (currently everywhere except in kernel/rcu and in kernel/torture.c). Also a couple of fixes to avoid sending IPIs to offline CPUs. - Updates to simplify RCU's dyntick-idle handling. - Updates to remove almost all uses of smp_read_barrier_depends() and read_barrier_depends(). - Torture-test updates. - Miscellaneous fixes" * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (72 commits) torture: Save a line in stutter_wait(): while -> for torture: Eliminate torture_runnable and perf_runnable torture: Make stutter less vulnerable to compilers and races locking/locktorture: Fix num reader/writer corner cases locking/locktorture: Fix rwsem reader_delay torture: Place all torture-test modules in one MAINTAINERS group rcutorture/kvm-build.sh: Skip build directory check rcutorture: Simplify functions.sh include path rcutorture: Simplify logging rcutorture/kvm-recheck-*: Improve result directory readability check rcutorture/kvm.sh: Support execution from any directory rcutorture/kvm.sh: Use consistent help text for --qemu-args rcutorture/kvm.sh: Remove unused variable, `alldone` rcutorture: Remove unused script, config2frag.sh rcutorture/configinit: Fix build directory error message rcutorture: Preempt RCU-preempt readers more vigorously torture: Reduce #ifdefs for preempt_schedule() rcu: Remove have_rcu_nocb_mask from tree_plugin.h rcu: Add comment giving debug strategy for double call_rcu() tracing, rcu: Hide trace event rcu_nocb_wake when not used ...
2018-01-30Merge branch 'core-debug-for-linus' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull STRICT_DEVMEM default from Ingo Molnar: "Make CONFIG_STRICT_DEVMEM default-y on x86 and arm64 as well, to follow the distro status quo" * 'core-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: Kconfig: Make STRICT_DEVMEM default-y on x86 and arm64
2018-01-30Merge tag 'kvm-s390-next-4.16-2' of ↵Radim Krčmář13-178/+496
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux KVM: s390: Fixes and features for 4.16 part 2 - exitless interrupts for emulated devices (Michael Mueller) - cleanup of cpuflag handling (David Hildenbrand) - kvm stat counter improvements (Christian Borntraeger) - vsie improvements (David Hildenbrand) - mm cleanup (Janosch Frank)
2018-01-29Merge branch 'x86-pti-for-linus' of ↵Linus Torvalds22-100/+324
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/pti updates from Thomas Gleixner: "Another set of melted spectrum related changes: - Code simplifications and cleanups for RSB and retpolines. - Make the indirect calls in KVM speculation safe. - Whitelist CPUs which are known not to speculate from Meltdown and prepare for the new CPUID flag which tells the kernel that a CPU is not affected. - A less rigorous variant of the module retpoline check which merily warns when a non-retpoline protected module is loaded and reflects that fact in the sysfs file. - Prepare for Indirect Branch Prediction Barrier support. - Prepare for exposure of the Speculation Control MSRs to guests, so guest OSes which depend on those "features" can use them. Includes a blacklist of the broken microcodes. The actual exposure of the MSRs through KVM is still being worked on" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/speculation: Simplify indirect_branch_prediction_barrier() x86/retpoline: Simplify vmexit_fill_RSB() x86/cpufeatures: Clean up Spectre v2 related CPUID flags x86/cpu/bugs: Make retpoline module warning conditional x86/bugs: Drop one "mitigation" from dmesg x86/nospec: Fix header guards names x86/alternative: Print unadorned pointers x86/speculation: Add basic IBPB (Indirect Branch Prediction Barrier) support x86/cpufeature: Blacklist SPEC_CTRL/PRED_CMD on early Spectre v2 microcodes x86/pti: Do not enable PTI on CPUs which are not vulnerable to Meltdown x86/msr: Add definitions for new speculation control MSRs x86/cpufeatures: Add AMD feature bits for Speculation Control x86/cpufeatures: Add Intel feature bits for Speculation Control x86/cpufeatures: Add CPUID_7_EDX CPUID leaf module/retpoline: Warn about missing retpoline in module KVM: VMX: Make indirect call speculation safe KVM: x86: Make indirect calls in emulator speculation safe
2018-01-29Merge branch 'x86-mm-for-linus' of ↵Linus Torvalds2-2/+46
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 mm update from Thomas Gleixner: "A single patch which excludes the GART aperture from vmcore as accessing that area from a dump kernel can crash the kernel. Not necessarily the nicest way to fix this, but curing this from ground up requires a more thorough rewrite of the whole kexec/kdump magic" * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/gart: Exclude GART aperture from vmcore
2018-01-29Merge branch 'x86-timers-for-linus' of ↵Linus Torvalds5-13/+67
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 timer updates from Thomas Gleixner: "A small set of updates for x86 specific timers: - Mark TSC invariant on a subset of Centaur CPUs - Allow TSC calibration without PIT on mobile platforms which lack legacy devices" * 'x86-timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/centaur: Mark TSC invariant x86/tsc: Introduce early tsc clocksource x86/time: Unconditionally register legacy timer interrupt x86/tsc: Allow TSC calibration without PIT
2018-01-29Merge branch 'x86-platform-for-linus' of ↵Linus Torvalds23-193/+1052
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 platform updates from Thomas Gleixner: "The platform support for x86 contains the following updates: - A set of updates for the UV platform to support new CPUs and to fix some of the UV4A BAU MRRs - The initial platform support for the jailhouse hypervisor to allow native Linux guests (inmates) in non-root cells. - A fix for the PCI initialization on Intel MID platforms" * 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits) x86/jailhouse: Respect pci=lastbus command line settings x86/jailhouse: Set X86_FEATURE_TSC_KNOWN_FREQ x86/platform/intel-mid: Move PCI initialization to arch_init() x86/platform/uv/BAU: Replace hard-coded values with MMR definitions x86/platform/UV: Fix UV4A BAU MMRs x86/platform/UV: Fix GAM MMR references in the UV x2apic code x86/platform/UV: Fix GAM MMR changes in UV4A x86/platform/UV: Add references to access fixed UV4A HUB MMRs x86/platform/UV: Fix UV4A support on new Intel Processors x86/platform/UV: Update uv_mmrs.h to prepare for UV4A fixes x86/jailhouse: Add PCI dependency x86/jailhouse: Hide x2apic code when CONFIG_X86_X2APIC=n x86/jailhouse: Initialize PCI support x86/jailhouse: Wire up IOAPIC for legacy UART ports x86/jailhouse: Halt instead of failing to restart x86/jailhouse: Silence ACPI warning x86/jailhouse: Avoid access of unsupported platform resources x86/jailhouse: Set up timekeeping x86/jailhouse: Enable PMTIMER x86/jailhouse: Enable APIC and SMP support ...
2018-01-29Merge branch 'x86-cache-for-linus' of ↵Linus Torvalds7-39/+169
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/cache updates from Thomas Gleixner: "A set of patches which add support for L2 cache partitioning to the Intel RDT facility" * 'x86-cache-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/intel_rdt: Add command line parameter to control L2_CDP x86/intel_rdt: Enable L2 CDP in MSR IA32_L2_QOS_CFG x86/intel_rdt: Add two new resources for L2 Code and Data Prioritization (CDP) x86/intel_rdt: Enumerate L2 Code and Data Prioritization (CDP) feature x86/intel_rdt: Add L2CDP support in documentation x86/intel_rdt: Update documentation
2018-01-29Merge branch 'timers-core-for-linus' of ↵Linus Torvalds22-523/+1123
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer updates from Thomas Gleixner: "The timer departement presents: - A rather large rework of the hrtimer infrastructure which introduces softirq based hrtimers to replace the spread of hrtimer/tasklet combos which force the actual callback execution into softirq context. The approach is completely different from the initial implementation which you cursed at 10 years ago rightfully. The softirq based timers have their own queues and there is no nasty indirection and list reshuffling in the hard interrupt anymore. This comes with conversion of some of the hrtimer/tasklet users, the rest and the final removal of that horrible interface will come towards the end of the merge window or go through the relevant maintainer trees. Note: The top commit merged the last minute bugfix for the 10 years old CPU hotplug bug as I wanted to make sure that I fatfinger the merge conflict resolution myself. - The overhaul of the STM32 clocksource/clockevents driver - A new driver for the Spreadtrum SC9860 timer - A new driver dor the Actions Semi S700 timer - The usual set of fixes and updates all over the place" * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (53 commits) usb/gadget/NCM: Replace tasklet with softirq hrtimer ALSA/dummy: Replace tasklet with softirq hrtimer hrtimer: Implement SOFT/HARD clock base selection hrtimer: Implement support for softirq based hrtimers hrtimer: Prepare handling of hard and softirq based hrtimers hrtimer: Add clock bases and hrtimer mode for softirq context hrtimer: Use irqsave/irqrestore around __run_hrtimer() hrtimer: Factor out __hrtimer_next_event_base() hrtimer: Factor out __hrtimer_start_range_ns() hrtimer: Remove the 'base' parameter from hrtimer_reprogram() hrtimer: Make remote enqueue decision less restrictive hrtimer: Unify remote enqueue handling hrtimer: Unify hrtimer removal handling hrtimer: Make hrtimer_force_reprogramm() unconditionally available hrtimer: Make hrtimer_reprogramm() unconditional hrtimer: Make hrtimer_cpu_base.next_timer handling unconditional hrtimer: Make the remote enqueue check unconditional hrtimer: Use accesor functions instead of direct access hrtimer: Make the hrtimer_cpu_base::hres_active field unconditional, to simplify the code hrtimer: Make room in 'struct hrtimer_cpu_base' ...
2018-01-29Merge branch 'irq-core-for-linus' of ↵Linus Torvalds35-220/+251
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq updates from Thomas Gleixner: "A rather small set of irq updates this time: - removal of the old and now obsolete irq domain debugging code - the new Goldfish PIC driver - the usual pile of small fixes and updates" * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqdomain: Kill CONFIG_IRQ_DOMAIN_DEBUG irq/work: Improve the flag definitions irqchip/gic-v3: Fix the driver probe() fail due to disabled GICC entry irqchip/irq-goldfish-pic: Add Goldfish PIC driver dt-bindings/goldfish-pic: Add device tree binding for Goldfish PIC driver irqchip/ompic: fix return value check in ompic_of_init() dt-bindings/bcm283x: Define polarity of per-cpu interrupts irqchip/irq-bcm2836: Add support for DT interrupt polarity dt-bindings/bcm2836-l1-intc: Add interrupt polarity support
2018-01-29Merge tag 'xtensa-20180129' of git://github.com/jcmvbkbc/linux-xtensaLinus Torvalds57-663/+829
Pull Xtensa updates from Max Filippov: - add SSP support - add KASAN support - improvements to xtensa-specific assembly: - use ENTRY and ENDPROC consistently - clean up and unify word alignment macros - clean up and unify fixup marking - use 'call' instead of 'callx' where possible - various cleanups: - consiolidate kernel stack size related definitions - replace #ifdef'fed/commented out debug printk statements with pr_debug - use struct exc_table instead of flat array for exception handling data - build kernel with -mtext-section-literals; simplify xtensa linker script - fix futex_atomic_cmpxchg_inatomic() * tag 'xtensa-20180129' of git://github.com/jcmvbkbc/linux-xtensa: (21 commits) xtensa: fix futex_atomic_cmpxchg_inatomic xtensa: shut up gcc-8 warnings xtensa: print kernel sections info in mem_init xtensa: use generic strncpy_from_user with KASAN xtensa: use __memset in __xtensa_clear_user xtensa: add support for KASAN xtensa: move fixmap and kmap just above the KSEG xtensa: don't clear swapper_pg_dir in paging_init xtensa: extract init_kio xtensa: implement early_trap_init xtensa: clean up exception handling structure xtensa: clean up custom-controlled debug output xtensa: enable stack protector xtensa: print hardware config ID on startup xtensa: consolidate kernel stack size related definitions xtensa: clean up functions in assembly code xtensa: clean up word alignment macros in assembly code xtensa: clean up fixups in assembly code xtensa: use call instead of callx in assembly code xtensa: build kernel with text-section-literals ...
2018-01-29Merge tag 'm68k-for-v4.16-tag1' of ↵Linus Torvalds26-651/+833
git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k Pull m68k updates from Geert Uytterhoeven: - first part of an overhaul of the NuBus subsystem, to bring it up to modern driver model standards - a race condition fix for Mac - defconfig updates * tag 'm68k-for-v4.16-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k: MAINTAINERS: Add NuBus subsystem entry m68k/mac: Fix race conditions in OSS interrupt dispatch nubus: Add support for the driver model nubus: Add expansion_type values for various Mac models nubus: Adopt standard linked list implementation nubus: Rename struct nubus_dev nubus: Rework /proc/bus/nubus/s/ implementation nubus: Generalize block resource handling nubus: Clean up whitespace nubus: Remove redundant code nubus: Call proc_mkdir() not more than once per slot directory nubus: Validate slot resource IDs nubus: Fix log spam nubus: Use static functions where possible nubus: Fix up header split nubus: Avoid array underflow and overflow m68k/defconfig: Update defconfigs for v4.15-rc1
2018-01-29Merge tag 'for-4.16-tag' of ↵Linus Torvalds47-1405/+2267
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs updates from David Sterba: "Features or user visible changes: - fallocate: implement zero range mode - avoid losing data raid profile when deleting a device - tree item checker: more checks for directory items and xattrs Notable fixes: - raid56 recovery: don't use cached stripes, that could be potentially changed and a later RMW or recovery would lead to corruptions or failures - let raid56 try harder to rebuild damaged data, reading from all stripes if necessary - fix scrub to repair raid56 in a similar way as in the case above Other: - cleanups: device freeing, removed some call indirections, redundant bio_put/_get, unused parameters, refactorings and renames - RCU list traversal fixups - simplify mount callchain, remove recursing back when mounting a subvolume - plug for fsync, may improve bio merging on multiple devices - compression heurisic: replace heap sort with radix sort, gains some performance - add extent map selftests, buffered write vs dio" * tag 'for-4.16-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (155 commits) btrfs: drop devid as device_list_add() arg btrfs: get device pointer from device_list_add() btrfs: set the total_devices in device_list_add() btrfs: move pr_info into device_list_add btrfs: make btrfs_free_stale_devices() to match the path btrfs: rename btrfs_free_stale_devices() arg to skip_dev btrfs: make btrfs_free_stale_devices() argument optional btrfs: make btrfs_free_stale_device() to iterate all stales btrfs: no need to check for btrfs_fs_devices::seeding btrfs: Use IS_ALIGNED in btrfs_truncate_block instead of opencoding it Btrfs: noinline merge_extent_mapping Btrfs: add WARN_ONCE to detect unexpected error from merge_extent_mapping Btrfs: extent map selftest: dio write vs dio read Btrfs: extent map selftest: buffered write vs dio read Btrfs: add extent map selftests Btrfs: move extent map specific code to extent_map.c Btrfs: add helper for em merge logic Btrfs: fix unexpected EEXIST from btrfs_get_extent Btrfs: fix incorrect block_len in merge_extent_mapping btrfs: Remove unused readahead spinlock ...
2018-01-29Merge tag '4.16-rc-SMB3' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds24-442/+3896
Pull cifs updates from Steve French: "Some fixes for stable, fixed SMB3 DFS support, SMB3 Direct (RDMA) and various bug fixes and cleanup" * tag '4.16-rc-SMB3' of git://git.samba.org/sfrench/cifs-2.6: (60 commits) fs/cifs/cifsacl.c Fixes typo in a comment update internal version number for cifs.ko cifs: add .splice_write CIFS: document tcon/ses/server refcount dance move a few externs to smbdirect.h to eliminate warning CIFS: zero sensitive data when freeing Cleanup some minor endian issues in smb3 rdma CIFS: dump IPC tcon in debug proc file CIFS: use tcon_ipc instead of use_ipc parameter of SMB2_ioctl CIFS: make IPC a regular tcon cifs: remove redundant duplicated assignment of pointer 'node' CIFS: SMBD: work around gcc -Wmaybe-uninitialized warning cifs: Fix autonegotiate security settings mismatch CIFS: SMBD: _smbd_get_connection() can be static CIFS: SMBD: Disable signing on SMB direct transport CIFS: SMBD: Add SMB Direct debug counters CIFS: SMBD: Upper layer performs SMB read via RDMA write through memory registration CIFS: SMBD: Read correct returned data length for RDMA write (SMB read) I/O CIFS: SMBD: Upper layer performs SMB write via RDMA read through memory registration CIFS: SMBD: Implement RDMA memory registration ...
2018-01-29Merge tag 'iversion-v4.16-1' of ↵Linus Torvalds53-153/+525
git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux Pull inode->i_version rework from Jeff Layton: "This pile of patches is a rework of the inode->i_version field. We have traditionally incremented that field on every inode data or metadata change. Typically this increment needs to be logged on disk even when nothing else has changed, which is rather expensive. It turns out though that none of the consumers of that field actually require this behavior. The only real requirement for all of them is that it be different iff the inode has changed since the last time the field was checked. Given that, we can optimize away most of the i_version increments and avoid dirtying inode metadata when the only change is to the i_version and no one is querying it. Queries of the i_version field are rather rare, so we can help write performance under many common workloads. This patch series converts existing accesses of the i_version field to a new API, and then converts all of the in-kernel filesystems to use it. The last patch in the series then converts the backend implementation to a scheme that optimizes away a large portion of the metadata updates when no one is looking at it. In my own testing this series significantly helps performance with small I/O sizes. I also got this email for Christmas this year from the kernel test robot (a 244% r/w bandwidth improvement with XFS over DAX, with 4k writes): https://lkml.org/lkml/2017/12/25/8 A few of the earlier patches in this pile are also flowing to you via other trees (mm, integrity, and nfsd trees in particular)". * tag 'iversion-v4.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux: (22 commits) fs: handle inode->i_version more efficiently btrfs: only dirty the inode in btrfs_update_time if something was changed xfs: avoid setting XFS_ILOG_CORE if i_version doesn't need incrementing fs: only set S_VERSION when updating times if necessary IMA: switch IMA over to new i_version API xfs: convert to new i_version API ufs: use new i_version API ocfs2: convert to new i_version API nfsd: convert to new i_version API nfs: convert to new i_version API ext4: convert to new i_version API ext2: convert to new i_version API exofs: switch to new i_version API btrfs: convert to new i_version API afs: convert to new i_version API affs: convert to new i_version API fat: convert to new i_version API fs: don't take the i_lock in inode_inc_iversion fs: new API for handling inode->i_version ntfs: remove i_version handling ...
2018-01-29Merge tag 'upstream-4.16-rc1' of git://git.infradead.org/linux-ubifsLinus Torvalds12-135/+140
Pull UBI/UBIFS updates from Richard Weinberger: - use the new fscrypt APIs - a fix for a Fastmap issue - other minor bug fixes * tag 'upstream-4.16-rc1' of git://git.infradead.org/linux-ubifs: ubi: block: Fix locking for idr_alloc/idr_remove mtd: ubi: wl: Fix error return code in ubi_wl_init() ubi: Fix copy/paste error in function documentation ubi: Fastmap: Fix typo ubifs: remove error message in ubifs_xattr_get ubi: fastmap: Erase outdated anchor PEBs during attach ubifs: switch to fscrypt_prepare_setattr() ubifs: switch to fscrypt_prepare_lookup() ubifs: switch to fscrypt_prepare_rename() ubifs: switch to fscrypt_prepare_link() ubifs: switch to fscrypt_file_open() ubi: fastmap: Clean up the initialization of pointer p ubi: fastmap: Use kmem_cache_free to deallocate memory ubi: Fix race condition between ubi volume creation and udev mtd: ubi: Use 'max_bad_blocks' to compute bad_peb_limit if available ubifs: Fix uninitialized variable in search_dh_cookie()
2018-01-29Merge branch 'for-4.16/block' of git://git.kernel.dk/linux-blockLinus Torvalds124-4729/+3884
Pull block updates from Jens Axboe: "This is the main pull request for block IO related changes for the 4.16 kernel. Nothing major in this pull request, but a good amount of improvements and fixes all over the map. This contains: - BFQ improvements, fixes, and cleanups from Angelo, Chiara, and Paolo. - Support for SMR zones for deadline and mq-deadline from Damien and Christoph. - Set of fixes for bcache by way of Michael Lyle, including fixes from himself, Kent, Rui, Tang, and Coly. - Series from Matias for lightnvm with fixes from Hans Holmberg, Javier, and Matias. Mostly centered around pblk, and the removing rrpc 1.2 in preparation for supporting 2.0. - A couple of NVMe pull requests from Christoph. Nothing major in here, just fixes and cleanups, and support for command tracing from Johannes. - Support for blk-throttle for tracking reads and writes separately. From Joseph Qi. A few cleanups/fixes also for blk-throttle from Weiping. - Series from Mike Snitzer that enables dm to register its queue more logically, something that's alwways been problematic on dm since it's a stacked device. - Series from Ming cleaning up some of the bio accessor use, in preparation for supporting multipage bvecs. - Various fixes from Ming closing up holes around queue mapping and quiescing. - BSD partition fix from Richard Narron, fixing a problem where we can't mount newer (10/11) FreeBSD partitions. - Series from Tejun reworking blk-mq timeout handling. The previous scheme relied on atomic bits, but it had races where we would think a request had timed out if it to reused at the wrong time. - null_blk now supports faking timeouts, to enable us to better exercise and test that functionality separately. From me. - Kill the separate atomic poll bit in the request struct. After this, we don't use the atomic bits on blk-mq anymore at all. From me. - sgl_alloc/free helpers from Bart. - Heavily contended tag case scalability improvement from me. - Various little fixes and cleanups from Arnd, Bart, Corentin, Douglas, Eryu, Goldwyn, and myself" * 'for-4.16/block' of git://git.kernel.dk/linux-block: (186 commits) block: remove smart1,2.h nvme: add tracepoint for nvme_complete_rq nvme: add tracepoint for nvme_setup_cmd nvme-pci: introduce RECONNECTING state to mark initializing procedure nvme-rdma: remove redundant boolean for inline_data nvme: don't free uuid pointer before printing it nvme-pci: Suspend queues after deleting them bsg: use pr_debug instead of hand crafted macros blk-mq-debugfs: don't allow write on attributes with seq_operations set nvme-pci: Fix queue double allocations block: Set BIO_TRACE_COMPLETION on new bio during split blk-throttle: use queue_is_rq_based block: Remove kblockd_schedule_delayed_work{,_on}() blk-mq: Avoid that blk_mq_delay_run_hw_queue() introduces unintended delays blk-mq: Rename blk_mq_request_direct_issue() into blk_mq_request_issue_directly() lib/scatterlist: Fix chaining support in sgl_alloc_order() blk-throttle: track read and write request individually block: add bdev_read_only() checks to common helpers block: fail op_is_write() requests to read-only partitions blk-throttle: export io_serviced_recursive, io_service_bytes_recursive ...
2018-01-29Merge tag 'edac_for_4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bpLinus Torvalds6-1/+357
Pull EDAC updates from Borislav Petkov: - new EDAC driver for some TI SOCs (Tero Kristo) - small cleanups * tag 'edac_for_4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: EDAC, mv64x60: Fix an error handling path EDAC, ti: Add support for TI keystone and DRA7xx EDAC EDAC, octeon: Fix an uninitialized variable warning
2018-01-29Merge tag 'regmap-v4.16' of ↵Linus Torvalds10-25/+229
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap Pull regmap updates from Mark Brown: "A very busy release for regmap, all fairly specialist stuff but useful: - Support for disabling locking from Bartosz Golaszewski, allowing users that handle their own locking to save some overhead. - Support for hwspinlocks in syscons in MFD from Baolin Wang, this is going through the regmap tree since the first users turned up some some cases that needed interface tweaks with 0 being used as a syscon identifier. - Support for devices with no read or write flag from Andrew F. Davis. - Basic support for devices on SoundWire buses from Vinod Koul" * tag 'regmap-v4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap: mfd: syscon: Add hardware spinlock support regmap: Allow empty read/write_flag_mask regcache: flat: Un-inline index lookup from cache access regmap: Add SoundWire bus support regmap: Add one flag to indicate if a hwlock should be used regmap: debugfs: document why we don't create the debugfs entries regmap: debugfs: emit a debug message when locking is disabled regmap: use proper part of work_buf for storing val regmap: potentially duplicate the name string stored in regmap regmap: Disable debugfs when locking is disabled regmap: rename regmap_lock_unlock_empty() to regmap_lock_unlock_none() regmap: allow to disable all locking mechanisms regmap: Remove the redundant config to select hwspinlock
2018-01-29Merge tag 'regulator-v4.16' of ↵Linus Torvalds14-231/+697
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator updates from Mark Brown: "This is a quiet release in terms of code volume but a fairly big one in terms of framework changes - we've got one long awaited feature in the form of runtime configuration of suspend and the start of coupled regulator support too: - Support for modifying the voltage and enable configuration devices will have in suspend, contributed by Chunyan Zhang. - Support for the Spreadtrum SC2731, contributed by Erick Chen. - The start of changes to support coupled regulators from Maciej Purski, the rest of the series should arrive for v4.17" * tag 'regulator-v4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: regulator: Fix build error regulator: core: Refactor regulator_list_voltage() regulator: core: Move of_find_regulator_by_node() to of_regulator.c regulator: add PM suspend and resume hooks regulator: empty the old suspend functions regulator: leave one item to record whether regulator is enabled regulator: make regulator voltage be an array to support more states regulator: added support for suspend states regulator: qcom_spmi: Use regmap helpers for enable/disable/is_enabled callback regulator: sc2731: Fix defines for SC2731_WR_UNLOCK and SC2731_PWR_WR_PROT_VALUE regulator: fix incorrect indentation of two assignment statements regulator: sc2731: Add regulator driver to support Spreadtrum SC2731 PMIC regulator: Add Spreadtrum SC2731 regulator documentation regulator: Update code examples in documentation MAINTAINERS: regulator: Add Documentation/power/regulator/ regulator: tps65218: Add NULL test for devm_kzalloc call regulator: tps65218: Remove unused enum tps65218_regulators