summaryrefslogtreecommitdiffstats
path: root/drivers
AgeCommit message (Collapse)AuthorFilesLines
2017-11-08x86/pvclock: add setter for pvclock_pvti_cpu0_vaJoao Martins1-1/+1
Right now there is only a pvclock_pvti_cpu0_va() which is defined on kvmclock since: commit dac16fba6fc5 ("x86/vdso: Get pvclock data from the vvar VMA instead of the fixmap") The only user of this interface so far is kvm. This commit adds a setter function for the pvti page and moves pvclock_pvti_cpu0_va to pvclock, which is a more generic place to have it; and would allow other PV clocksources to use it, such as Xen. While moving pvclock_pvti_cpu0_va into pvclock, rename also this function to pvclock_get_pvti_cpu0_va (including its call sites) to be symmetric with the setter (pvclock_set_pvti_cpu0_va). Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Acked-by: Andy Lutomirski <luto@kernel.org> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-11-08ptp_kvm: probe for kvm guest availabilityJoao Martins1-0/+3
In the event of moving pvclock_pvti_cpu0_va() definition to common pvclock code, this function would return a value on non KVM guests. Later on this would fail with a GPF on ptp_kvm_init when running on a Xen guest. Therefore, ptp_kvm_init() should check whether it is running in a KVM guest. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Acked-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-11-08xen/privcmd: remove unused variable pageidxColin Ian King1-3/+0
Variable pageidx is assigned a value but it is never read, hence it is redundant and can be removed. Cleans up clang warning: drivers/xen/privcmd.c:199:2: warning: Value stored to 'pageidx' is never read Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-11-06xen: select grant interface versionJuergen Gross1-2/+33
Grant v2 will be needed in cases where a frame number in the grant table can exceed 32 bits. For PV guests this is a host feature, while for HVM guests this is a guest feature. So select grant v2 in case frame numbers can be larger than 32 bits and grant v1 else. For testing purposes add a way to specify the grant interface version via a boot parameter. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-11-06xen: add grant interface version dependent constants to gnttab_opsJuergen Gross1-30/+43
Instead of having multiple variables with constants like grant_table_version or grefs_per_grant_frame add those to struct gnttab_ops and access them just via the gnttab_interface pointer. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-11-06xen: limit grant v2 interface to the v1 functionalityJuergen Gross1-150/+0
As there is currently no user for sub-page grants or transient grants remove that functionality. This at once makes it possible to switch from grant v2 to grant v1 without restrictions, as there is no loss of functionality other than the limited frame number width related to the switch. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-11-06xen: re-introduce support for grant v2 interfaceJuergen Gross1-6/+306
The grant v2 support was removed from the kernel with commit 438b33c7145ca8a5131a30c36d8f59bce119a19a ("xen/grant-table: remove support for V2 tables") as the higher memory footprint of v2 grants resulted in less grants being possible for a kernel compared to the v1 grant interface. As machines with more than 16TB of memory are expected to be more common in the near future support of grant v2 is mandatory in order to be able to run a Xen pv domain at any memory location. So re-add grant v2 support basically by reverting above commit. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-11-03xen/pvcalls: remove redundant check for irq >= 0Colin Ian King1-3/+1
This is a moot point, but irq is always less than zero at the label out_error, so the check for irq >= 0 is redundant and can be removed. Detected by CoverityScan, CID#1460371 ("Logically dead code") Fixes: cb1c7d9bbc87 ("xen/pvcalls: implement connect command") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-11-03xen/pvcalls: fix unsigned less than zero error checkColin Ian King1-4/+3
The check on bedata->ref is never true because ref is an unsigned integer. Fix this by assigning signed int ret to the return of the call to gnttab_claim_grant_reference so the -ve return can be checked. Detected by CoverityScan, CID#1460358 ("Unsigned compared against 0") Fixes: 219681909913 ("xen/pvcalls: connect to the backend") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-11-03xen/pvcalls-front: mark expected switch fall-throughGustavo A. R. Silva1-1/+2
In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. Notice that in this particular case I placed the "fall through" comment on its own line, which is what GCC is expecting to find. Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-11-03xen: xenbus_probe_frontend: mark expected switch fall-throughsGustavo A. R. Silva1-0/+2
In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. Addresses-Coverity-ID: 146562 Addresses-Coverity-ID: 146563 Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-11-02xen/time: do not decrease steal time after live migration on xenDongli Zhang2-7/+72
After guest live migration on xen, steal time in /proc/stat (cpustat[CPUTIME_STEAL]) might decrease because steal returned by xen_steal_lock() might be less than this_rq()->prev_steal_time which is derived from previous return value of xen_steal_clock(). For instance, steal time of each vcpu is 335 before live migration. cpu 198 0 368 200064 1962 0 0 1340 0 0 cpu0 38 0 81 50063 492 0 0 335 0 0 cpu1 65 0 97 49763 634 0 0 335 0 0 cpu2 38 0 81 50098 462 0 0 335 0 0 cpu3 56 0 107 50138 374 0 0 335 0 0 After live migration, steal time is reduced to 312. cpu 200 0 370 200330 1971 0 0 1248 0 0 cpu0 38 0 82 50123 500 0 0 312 0 0 cpu1 65 0 97 49832 634 0 0 312 0 0 cpu2 39 0 82 50167 462 0 0 312 0 0 cpu3 56 0 107 50207 374 0 0 312 0 0 Since runstate times are cumulative and cleared during xen live migration by xen hypervisor, the idea of this patch is to accumulate runstate times to global percpu variables before live migration suspend. Once guest VM is resumed, xen_get_runstate_snapshot_cpu() would always return the sum of new runstate times and previously accumulated times stored in global percpu variables. Comment above HYPERVISOR_suspend() has been removed as it is inaccurate: the call can return an error code (e.g., possibly -EPERM in the future). Similar and more severe issue would impact prior linux 4.8-4.10 as discussed by Michael Las at https://0xstubs.org/debugging-a-flaky-cpu-steal-time-counter-on-a-paravirtualized-xen-guest, which would overflow steal time and lead to 100% st usage in top command for linux 4.8-4.10. A backport of this patch would fix that issue. [boris: added linux/slab.h to driver/xen/time.c, slightly reformatted commit message] References: https://0xstubs.org/debugging-a-flaky-cpu-steal-time-counter-on-a-paravirtualized-xen-guest Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-10-31xen: introduce a Kconfig option to enable the pvcalls frontendStefano Stabellini2-0/+12
Also add pvcalls-front to the Makefile. Signed-off-by: Stefano Stabellini <stefano@aporeto.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> CC: boris.ostrovsky@oracle.com CC: jgross@suse.com Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-10-31xen/pvcalls: implement release commandStefano Stabellini2-0/+99
Send PVCALLS_RELEASE to the backend and wait for a reply. Take both in_mutex and out_mutex to avoid concurrent accesses. Then, free the socket. For passive sockets, check whether we have already pre-allocated an active socket for the purpose of being accepted. If so, free that as well. Signed-off-by: Stefano Stabellini <stefano@aporeto.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> CC: boris.ostrovsky@oracle.com CC: jgross@suse.com Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-10-31xen/pvcalls: implement poll commandStefano Stabellini2-9/+138
For active sockets, check the indexes and use the inflight_conn_req waitqueue to wait. For passive sockets if an accept is outstanding (PVCALLS_FLAG_ACCEPT_INFLIGHT), check if it has been answered by looking at bedata->rsp[req_id]. If so, return POLLIN. Otherwise use the inflight_accept_req waitqueue. If no accepts are inflight, send PVCALLS_POLL to the backend. If we have outstanding POLL requests awaiting for a response use the inflight_req waitqueue: inflight_req is awaken when a new response is received; on wakeup we check whether the POLL response is arrived by looking at the PVCALLS_FLAG_POLL_RET flag. We set the flag from pvcalls_front_event_handler, if the response was for a POLL command. In pvcalls_front_event_handler, get the struct sock_mapping from the poll id (we previously converted struct sock_mapping* to uintptr_t and used it as id). Signed-off-by: Stefano Stabellini <stefano@aporeto.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> CC: boris.ostrovsky@oracle.com CC: jgross@suse.com Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-10-31xen/pvcalls: implement recvmsgStefano Stabellini2-0/+115
Implement recvmsg by copying data from the "in" ring. If not enough data is available and the recvmsg call is blocking, then wait on the inflight_conn_req waitqueue. Take the active socket in_mutex so that only one function can access the ring at any given time. Signed-off-by: Stefano Stabellini <stefano@aporeto.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> CC: boris.ostrovsky@oracle.com CC: jgross@suse.com Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-10-31xen/pvcalls: implement sendmsgStefano Stabellini2-0/+124
Send data to an active socket by copying data to the "out" ring. Take the active socket out_mutex so that only one function can access the ring at any given time. If not enough room is available on the ring, rather than returning immediately or sleep-waiting, spin for up to 5000 cycles. This small optimization turns out to improve performance significantly. Signed-off-by: Stefano Stabellini <stefano@aporeto.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> CC: boris.ostrovsky@oracle.com CC: jgross@suse.com Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-10-31xen/pvcalls: implement accept commandStefano Stabellini2-0/+148
Introduce a waitqueue to allow only one outstanding accept command at any given time and to implement polling on the passive socket. Introduce a flags field to keep track of in-flight accept and poll commands. Send PVCALLS_ACCEPT to the backend. Allocate a new active socket. Make sure that only one accept command is executed at any given time by setting PVCALLS_FLAG_ACCEPT_INFLIGHT and waiting on the inflight_accept_req waitqueue. Convert the new struct sock_mapping pointer into an uintptr_t and use it as id for the new socket to pass to the backend. Check if the accept call is non-blocking: in that case after sending the ACCEPT command to the backend store the sock_mapping pointer of the new struct and the inflight req_id then return -EAGAIN (which will respond only when there is something to accept). Next time accept is called, we'll check if the ACCEPT command has been answered, if so we'll pick up where we left off, otherwise we return -EAGAIN again. Note that, differently from the other commands, we can use wait_event_interruptible (instead of wait_event) in the case of accept as we are able to track the req_id of the ACCEPT response that we are waiting. Signed-off-by: Stefano Stabellini <stefano@aporeto.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> CC: boris.ostrovsky@oracle.com CC: jgross@suse.com Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-10-31xen/pvcalls: implement listen commandStefano Stabellini2-0/+58
Send PVCALLS_LISTEN to the backend. Signed-off-by: Stefano Stabellini <stefano@aporeto.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> CC: boris.ostrovsky@oracle.com CC: jgross@suse.com Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-10-31xen/pvcalls: implement bind commandStefano Stabellini2-0/+69
Send PVCALLS_BIND to the backend. Introduce a new structure, part of struct sock_mapping, to store information specific to passive sockets. Introduce a status field to keep track of the status of the passive socket. Signed-off-by: Stefano Stabellini <stefano@aporeto.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> CC: boris.ostrovsky@oracle.com CC: jgross@suse.com Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-10-31xen/pvcalls: implement connect commandStefano Stabellini2-0/+160
Send PVCALLS_CONNECT to the backend. Allocate a new ring and evtchn for the active socket. Introduce fields in struct sock_mapping to keep track of active sockets. Introduce a waitqueue to allow the frontend to wait on data coming from the backend on the active socket (recvmsg command). Two mutexes (one of reads and one for writes) will be used to protect the active socket in and out rings from concurrent accesses. Signed-off-by: Stefano Stabellini <stefano@aporeto.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> CC: boris.ostrovsky@oracle.com CC: jgross@suse.com Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-10-31xen/pvcalls: implement socket command and handle eventsStefano Stabellini2-0/+139
Send a PVCALLS_SOCKET command to the backend, use the masked req_prod_pvt as req_id. This way, req_id is guaranteed to be between 0 and PVCALLS_NR_REQ_PER_RING. We already have a slot in the rsp array ready for the response, and there cannot be two outstanding responses with the same req_id. Wait for the response by waiting on the inflight_req waitqueue and check for the req_id field in rsp[req_id]. Use atomic accesses and barriers to read the field. Note that the barriers are simple smp barriers (as opposed to virt barriers) because they are for internal frontend synchronization, not frontend<->backend communication. Once a response is received, clear the corresponding rsp slot by setting req_id to PVCALLS_INVALID_ID. Note that PVCALLS_INVALID_ID is invalid only from the frontend point of view. It is not part of the PVCalls protocol. pvcalls_front_event_handler is in charge of copying responses from the ring to the appropriate rsp slot. It is done by copying the body of the response first, then by copying req_id atomically. After the copies, wake up anybody waiting on waitqueue. socket_lock protects accesses to the ring. Convert the pointer to sock_mapping into an uintptr_t and use it as id for the new socket to pass to the backend. The struct will be fully initialized later on connect or bind. sock->sk->sk_send_head is not used for ip sockets: reuse the field to store a pointer to the struct sock_mapping corresponding to the socket. This way, we can easily get the struct sock_mapping from the struct socket. Signed-off-by: Stefano Stabellini <stefano@aporeto.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> CC: boris.ostrovsky@oracle.com CC: jgross@suse.com Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-10-31xen/pvcalls: connect to the backendStefano Stabellini1-0/+132
Implement the probe function for the pvcalls frontend. Read the supported versions, max-page-order and function-calls nodes from xenstore. Only one frontend<->backend connection is supported at any given time for a guest. Store the active frontend device to a static pointer. Introduce a stub functions for the event handler. Signed-off-by: Stefano Stabellini <stefano@aporeto.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> CC: boris.ostrovsky@oracle.com CC: jgross@suse.com Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-10-31xen/pvcalls: implement frontend disconnectStefano Stabellini1-0/+71
Introduce a data structure named pvcalls_bedata. It contains pointers to the command ring, the event channel, a list of active sockets and a list of passive sockets. Lists accesses are protected by a spin_lock. Introduce a waitqueue to allow waiting for a response on commands sent to the backend. Introduce an array of struct xen_pvcalls_response to store commands responses. Introduce a new struct sock_mapping to keep track of sockets. In this patch the struct sock_mapping is minimal, the fields will be added by the next patches. pvcalls_refcount is used to keep count of the outstanding pvcalls users. Only remove connections once the refcount is zero. Implement pvcalls frontend removal function. Go through the list of active and passive sockets and free them all, one at a time. Signed-off-by: Stefano Stabellini <stefano@aporeto.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> CC: boris.ostrovsky@oracle.com CC: jgross@suse.com Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-10-31xen/pvcalls: introduce the pvcalls xenbus frontendStefano Stabellini1-0/+61
Introduce a xenbus frontend for the pvcalls protocol, as defined by https://xenbits.xen.org/docs/unstable/misc/pvcalls.html. This patch only adds the stubs, the code will be added by the following patches. Signed-off-by: Stefano Stabellini <stefano@aporeto.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> CC: boris.ostrovsky@oracle.com CC: jgross@suse.com Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-10-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds22-150/+299
Pull networking fixes from David Miller: 1) Fix route leak in xfrm_bundle_create(). 2) In mac80211, validate user rate mask before configuring it. From Johannes Berg. 3) Properly enforce memory limits in fair queueing code, from Toke Hoiland-Jorgensen. 4) Fix lockdep splat in inet_csk_route_req(), from Eric Dumazet. 5) Fix TSO header allocation and management in mvpp2 driver, from Yan Markman. 6) Don't take socket lock in BH handler in strparser code, from Tom Herbert. 7) Don't show sockets from other namespaces in AF_UNIX code, from Andrei Vagin. 8) Fix double free in error path of tap_open(), from Girish Moodalbail. 9) Fix TX map failure path in igb and ixgbe, from Jean-Philippe Brucker and Alexander Duyck. 10) Fix DCB mode programming in stmmac driver, from Jose Abreu. 11) Fix err_count handling in various tunnels (ipip, ip6_gre). From Xin Long. 12) Properly align SKB head before building SKB in tuntap, from Jason Wang. 13) Avoid matching qdiscs with a zero handle during lookups, from Cong Wang. 14) Fix various endianness bugs in sctp, from Xin Long. 15) Fix tc filter callback races and add selftests which trigger the problem, from Cong Wang. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (73 commits) selftests: Introduce a new test case to tc testsuite selftests: Introduce a new script to generate tc batch file net_sched: fix call_rcu() race on act_sample module removal net_sched: add rtnl assertion to tcf_exts_destroy() net_sched: use tcf_queue_work() in tcindex filter net_sched: use tcf_queue_work() in rsvp filter net_sched: use tcf_queue_work() in route filter net_sched: use tcf_queue_work() in u32 filter net_sched: use tcf_queue_work() in matchall filter net_sched: use tcf_queue_work() in fw filter net_sched: use tcf_queue_work() in flower filter net_sched: use tcf_queue_work() in flow filter net_sched: use tcf_queue_work() in cgroup filter net_sched: use tcf_queue_work() in bpf filter net_sched: use tcf_queue_work() in basic filter net_sched: introduce a workqueue for RCU callbacks of tc filter sctp: fix some type cast warnings introduced since very beginning sctp: fix a type cast warnings that causes a_rwnd gets the wrong value sctp: fix some type cast warnings introduced by transport rhashtable sctp: fix some type cast warnings introduced by stream reconf ...
2017-10-28Merge branch 'for-linus' of ↵Linus Torvalds3-9/+14
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input fixes from Dmitry Torokhov: - fix gtco tablet driver, tightening parsing of HID descriptors - add ACPI ID added to Elan driver to be able to handle touchpads found in Lenovo Ideapad 320/520 - fix the Symaptics RMI4 driver to adjust handling of buttons * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: synaptics-rmi4 - limit the range of what GPIOs are buttons Input: gtco - fix potential out-of-bound access Input: elan_i2c - add ELAN0611 to the ACPI table
2017-10-28Merge tag 'drm-fixes-for-v4.14-rc7' of ↵Linus Torvalds8-79/+27
git://people.freedesktop.org/~airlied/linux Pull drm fixes from Dave Airlie: "Two amd fixes, one i915 core and a few i915 GVT fixes, things seem fairly quiet" * tag 'drm-fixes-for-v4.14-rc7' of git://people.freedesktop.org/~airlied/linux: drm/i915/gvt: Adding ACTHD mmio read handler drm/i915/gvt: Extract mmio_read_from_hw() common function drm/i915/gvt: Refine MMIO_RING_F() drm/i915/gvt: properly check per_ctx bb valid state drm/i915/perf: fix perf enable/disable ioctls with 32bits userspace drm/amd/amdgpu: Remove workaround check for UVD6 on APUs drm/amd/powerplay: fix uninitialized variable
2017-10-28Merge tag 'scsi-fixes' of ↵Linus Torvalds9-22/+37
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Six fixes for mostly minor issues, most of which have small race windows for occurring" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: Suppress a kernel warning in case the prep function returns BLKPREP_DEFER scsi: sg: Re-fix off by one in sg_fill_request_table() scsi: aacraid: Fix controller initialization failure scsi: hpsa: Fix configured_logical_drive_count·check scsi: qla2xxx: Initialize Work element before requesting IRQs scsi: zfcp: fix erp_action use-before-initialize in REC action trace
2017-10-28tap: reference to KVA of an unloaded module causes kernel panicGirish Moodalbail3-6/+7
The commit 9a393b5d5988 ("tap: tap as an independent module") created a separate tap module that implements tap functionality and exports interfaces that will be used by macvtap and ipvtap modules to create create respective tap devices. However, that patch introduced a regression wherein the modules macvtap and ipvtap can be removed (through modprobe -r) while there are applications using the respective /dev/tapX devices. These applications cause kernel to hold reference to /dev/tapX through 'struct cdev macvtap_cdev' and 'struct cdev ipvtap_dev' defined in macvtap and ipvtap modules respectively. So, when the application is later closed the kernel panics because we are referencing KVA that is present in the unloaded modules. ----------8<------- Example ----------8<---------- $ sudo ip li add name mv0 link enp7s0 type macvtap $ sudo ip li show mv0 |grep mv0| awk -e '{print $1 $2}' 14:mv0@enp7s0: $ cat /dev/tap14 & $ lsmod |egrep -i 'tap|vlan' macvtap 16384 0 macvlan 24576 1 macvtap tap 24576 3 macvtap $ sudo modprobe -r macvtap $ fg cat /dev/tap14 ^C <...system panics...> BUG: unable to handle kernel paging request at ffffffffa038c500 IP: cdev_put+0xf/0x30 ----------8<-----------------8<---------- The fix is to set cdev.owner to the module that creates the tap device (either macvtap or ipvtap). With this set, the operations (in fs/char_dev.c) on char device holds and releases the module through cdev_get() and cdev_put() and will not allow the module to unload prematurely. Fixes: 9a393b5d5988ea4e (tap: tap as an independent module) Signed-off-by: Girish Moodalbail <girish.moodalbail@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28tuntap: properly align skb->head before building skbJason Wang1-0/+1
An unaligned alloc_frag->offset caused by previous allocation will result an unaligned skb->head. This will lead unaligned skb_shared_info and then unaligned dataref which requires to be aligned for accessing on some architecture. Fix this by aligning alloc_frag->offset before the frag refilling. Fixes: 0bbd7dad34f8 ("tun: make tun_build_skb() thread safe") Cc: Eric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com> Cc: Wei Wei <dotweiba@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Reported-by: Wei Wei <dotweiba@gmail.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27Merge tag 'for-linus-4.14c-rc7-tag' of ↵Linus Torvalds2-7/+14
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: - a fix for the Xen gntdev device repairing an issue in case of partial failure of mapping multiple pages of another domain - a fix of a regression in the Xen balloon driver introduced in 4.13 - a build fix for Xen on ARM which will trigger e.g. for Linux RT - a maintainers update for pvops (not really Xen, but carrying through this tree just for convenience) * tag 'for-linus-4.14c-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: maintainers: drop Chris Wright from pvops arm/xen: don't inclide rwlock.h directly. xen: fix booting ballooned down hvm guest xen/gntdev: avoid out of bounds access in case of partial gntdev_mmap()
2017-10-27Merge branch 'efi-urgent-for-linus' of ↵Linus Torvalds2-1/+5
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull EFI fixes from Ingo Molnar: "Two fixes: an ARM fix for KASLR interaction with hibernation, plus an efi_test crash fix" * 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: efi/libstub/arm: Don't randomize runtime regions when CONFIG_HIBERNATION=y efi/efi_test: Prevent an Oops in efi_runtime_query_capsulecaps()
2017-10-27Input: synaptics-rmi4 - limit the range of what GPIOs are buttonsAndrew Duggan1-2/+3
By convention the first 6 bits of F30 Ctrl 2 and 3 are used to signify GPIOs which are connected to buttons. Additional GPIOs may be used as input GPIOs to signal the touch controller of some event (ie disable touchpad). These additional GPIOs may meet the criteria of a button in rmi_f30_is_valid_button() but should not be considered buttons. This patch limits the GPIOs which are mapped to buttons to just the first 6. Signed-off-by: Andrew Duggan <aduggan@synaptics.com> Reported-by: Daniel Martin <consume.noise@gmail.com> Tested-by: Daniel Martin <consume.noise@gmail.com> Acked-By: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2017-10-27Input: gtco - fix potential out-of-bound accessDmitry Torokhov1-7/+10
parse_hid_report_descriptor() has a while (i < length) loop, which only guarantees that there's at least 1 byte in the buffer, but the loop body can read multiple bytes which causes out-of-bounds access. Reported-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Andrey Konovalov <andreyknvl@google.com> Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2017-10-28Merge branch '40GbE' of ↵David S. Miller5-21/+22
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2017-10-26 This series contains fixes to e1000, igb, ixgbe and i40e. Vincenzo Maffione fixes a potential race condition which would result in the interface being up but transmits are disabled in the hardware. Colin Ian King fixes a possible NULL pointer dereference in e1000, which was found by Coverity. Jean-Philippe Brucker fixes a possible kernel panic when a driver cannot map a transmit buffer, which is caused by an erroneous test. Alex provides a fix for ixgbe, which is a partial revert of the commit ffed21bcee7a ("ixgbe: Don't bother clearing buffer memory for descriptor rings") because the previous commit messed up the exception handling path by adding the count back in when we did not need to. Also fixed a typo, where the transmit ITR setting was being used to determine if we were using adaptive receive interrupt moderation or not. Lastly, fixed a memory leak by including programming descriptors in the cleaned count. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27net: stmmac: First Queue must always be in DCB modeJose Abreu1-0/+7
According to DWMAC databook the first queue operating mode must always be in DCB. As MTL_QUEUE_DCB = 1, we need to always set the first queue operating mode to DCB otherwise driver will think that queue is in AVB mode (because MTL_QUEUE_AVB = 0). Signed-off-by: Jose Abreu <joabreu@synopsys.com> Cc: Joao Pinto <jpinto@synopsys.com> Cc: David S. Miller <davem@davemloft.net> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27net: stmmac: dwc-qos-eth: Fix typo in DT bindings parsingJose Abreu1-1/+1
According to DT bindings documentation we are expecting a property called "snps,read-requests" but we are parsing instead a property called "read,read-requests". This is clearly a typo. Fix it. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Cc: Joao Pinto <jpinto@synopsys.com> Cc: David S. Miller <davem@davemloft.net> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27Merge tag 'mlx5-fixes-2017-10-26' of ↵David S. Miller5-96/+202
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== Mellanox, mlx5 fixes 2017-10-26 The series includes some misc fixes for mlx5 core and etherent driver. Please pull and let me know if there's any problem. For -Stable: net/mlx5e: Properly deal with encap flows add/del under neigh update (kernels >= 4.12) net/mlx5: Fix health work queue spin lock to IRQ safe (kernels >= 4.13) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27Merge tag 'drm-intel-fixes-2017-10-26' of ↵Dave Airlie5-65/+15
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes One fix for stable: - fix perf enable/disable ioctls for 32bits (Lionel) Plus GVT fixes: - Fix per_ctx_bb check (Zhenyu) - Fix GPU hang of Linux guest (Xion) - Refine MMIO_RING_F to check for presence of VCS2 ring (Zhi) * tag 'drm-intel-fixes-2017-10-26' of git://anongit.freedesktop.org/drm/drm-intel: drm/i915/gvt: Adding ACTHD mmio read handler drm/i915/gvt: Extract mmio_read_from_hw() common function drm/i915/gvt: Refine MMIO_RING_F() drm/i915/gvt: properly check per_ctx bb valid state
2017-10-26Merge tag 'for-linus' of ↵Linus Torvalds1-1/+12
git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull rdma fix from Doug Ledford: "Fix an oops issue in the new RDMA netlink code" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: RDMA/netlink: OOPs in rdma_nl_rcv_msg() from misinterpreted flag
2017-10-26drm/i915/gvt: Adding ACTHD mmio read handlerXiong Zhang1-2/+3
When a workload is too heavy to finish it in gpu hang check timer intervals(1.5), gpu hang check function will check ACTHD register value to decide whether gpu is real dead or not. On real hw, ACTHD is updated by HW when workload is running, then host kernel won't think it is gpu hang. while guest kernel always read a constant ACTHD value as GVT doesn't supply ACTHD emulate handler, then guest kernel detects a fake gpu hang. To remove such guest fake gpu hang, this patch supply ACTHD mmio read handler which read real HW ACTHD register directly. Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com> Signed-off-by: Zhi Wang <zhi.a.wang@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/b4c9a097-3e62-124e-6856-b0c37764df7b@intel.com
2017-10-27drm/i915/gvt: Extract mmio_read_from_hw() common functionXiong Zhang1-16/+5
The mmio read handler for ring timestmap / instdone register are same as reading hw value directly. Extract it as common function to reduce code duplications. Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com> Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
2017-10-27drm/i915/gvt: Refine MMIO_RING_F()Zhi Wang2-45/+2
Inspect if the host has VCS2 ring by host i915 macro in MMIO_RING_F(). Also this helps on reducing some LOCs. Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
2017-10-27drm/i915/gvt: properly check per_ctx bb valid stateZhenyu Wang3-2/+5
Need to check valid state for per_ctx bb and bypass batch buffer combine for scan if necessary. Otherwise adding invalid MI batch buffer start cmd for per_ctx bb will cause scan failure, which is taken as -EFAULT now so vGPU would be put in failsafe. This trys to fix that by checking per_ctx bb valid state. Also remove old invalid WARNING that indirect ctx bb shouldn't depend on valid per_ctx bb. Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
2017-10-26Merge tag 'pm-4.14-rc7' of ↵Linus Torvalds6-32/+57
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fix from Rafael Wysocki: "This fixes a device power management quality of service (PM QoS) framework implementation issue causing 'no restriction' requests for device resume latency, including 'no restriction' set by user space, to effectively override requests with specific device resume latency requirements. It is late in the cycle, but the bug in question is in the 'user space can trigger unexpected behavior' category and the fix is stable-candidate, so here it goes" * tag 'pm-4.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM / QoS: Fix device resume latency PM QoS
2017-10-26Merge branch 'for-linus' of git://git.kernel.dk/linux-blockLinus Torvalds5-14/+69
Pull block fixes from Jens Axboe: "A few select fixes that should go into this series. Mainly for NVMe, but also a single stable fix for nbd from Josef" * 'for-linus' of git://git.kernel.dk/linux-block: nbd: handle interrupted sendmsg with a sndtimeo set nvme-rdma: Fix error status return in tagset allocation failure nvme-rdma: Fix possible double free in reconnect flow nvmet: synchronize sqhd update nvme-fc: retry initial controller connections 3 times nvme-fc: fix iowait hang
2017-10-26Merge tag 'spi-fix-v4.14-rc5' of ↵Linus Torvalds4-107/+64
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi fixes from Mark Brown: "There are a bunch of device specific fixes (more than I'd like, I've been lax sending these) plus one important core fix for the conversion to use an IDR for bus number allocation which avoids issues with collisions when some but not all of the buses in the system have a fixed bus number specified. The Armada changes are rather large, specificially "spi: armada-3700: Fix padding when sending not 4-byte aligned data", but it's a storage corruption issue and there's things like indentation changes which make it look bigger than it really is. It's been cooking in -next for quite a while now and is part of the reason for the delay" * tag 'spi-fix-v4.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: fix IDR collision on systems with both fixed and dynamic SPI bus numbers spi: bcm-qspi: Fix use after free in bcm_qspi_probe() in error path spi: a3700: Return correct value on timeout detection spi: uapi: spidev: add missing ioctl header spi: stm32: Fix logical error in stm32_spi_prepare_mbr() spi: armada-3700: Fix padding when sending not 4-byte aligned data spi: armada-3700: Fix failing commands with quad-SPI
2017-10-26i40e: Add programming descriptors to cleaned_countAlexander Duyck1-0/+1
This patch updates the i40e driver to include programming descriptors in the cleaned_count. Without this change it becomes possible for us to leak memory as we don't trigger a large enough allocation when the time comes to allocate new buffers and we end up overwriting a number of rx_buffers equal to the number of programming descriptors we encountered. Fixes: 0e626ff7ccbf ("i40e: Fix support for flow director programming status") Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Anders K. Pedersen <akp@cohaesio.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-26i40e: Fix incorrect use of tx_itr_setting when checking for Rx ITR setupAlexander Duyck1-1/+1
It looks like there was either a copy/paste error or just a typo that resulted in the Tx ITR setting being used to determine if we were using adaptive Rx interrupt moderation or not. This patch fixes the typo. Fixes: 65e87c0398f5 ("i40evf: support queue-specific settings for interrupt moderation") Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>