summaryrefslogtreecommitdiffstats
path: root/drivers/virtio
AgeCommit message (Collapse)AuthorFilesLines
2021-03-18Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds2-6/+3
Pull virtio fixes from Michael Tsirkin: "Some fixes and cleanups all over the place" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: vhost-vdpa: set v->config_ctx to NULL if eventfd_ctx_fdget() fails vhost-vdpa: fix use-after-free of v->config_ctx vhost: Fix vhost_vq_reset() vhost_vdpa: fix the missing irq_bypass_unregister_producer() invocation vdpa_sim: Skip typecasting from void* virtio: remove export for virtio_config_{enable, disable} virtio-mmio: Use to_virtio_mmio_device() to simply code vdpa: set the virtqueue num during register
2021-03-14virtio: remove export for virtio_config_{enable, disable}Xianting Tian1-4/+2
virtio_config_enable(), virtio_config_disable() are only used inside drivers/virtio/virtio.c, so it doesn't need export the symbols. Signed-off-by: Xianting Tian <xianting_tian@126.com> Link: https://lore.kernel.org/r/1613838498-8791-1-git-send-email-xianting_tian@126.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
2021-03-14virtio-mmio: Use to_virtio_mmio_device() to simply codeTang Bin1-2/+1
The file virtio_mmio.c has defined the function to_virtio_mmio_device, so use it instead of container_of() to simply code. Signed-off-by: Tang Bin <tangbin@cmss.chinamobile.com> Link: https://lore.kernel.org/r/20210222055724.220-1-tangbin@cmss.chinamobile.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-02-26Merge branch 'akpm' (patches from Andrew)Linus Torvalds1-15/+28
Merge more updates from Andrew Morton: "118 patches: - The rest of MM. Includes kfence - another runtime memory validator. Not as thorough as KASAN, but it has unmeasurable overhead and is intended to be usable in production builds. - Everything else Subsystems affected by this patch series: alpha, procfs, sysctl, misc, core-kernel, MAINTAINERS, lib, bitops, checkpatch, init, coredump, seq_file, gdb, ubsan, initramfs, and mm (thp, cma, vmstat, memory-hotplug, mlock, rmap, zswap, zsmalloc, cleanups, kfence, kasan2, and pagemap2)" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (118 commits) MIPS: make userspace mapping young by default initramfs: panic with memory information ubsan: remove overflow checks kgdb: fix to kill breakpoints on initmem after boot scripts/gdb: fix list_for_each x86: fix seq_file iteration for pat/memtype.c seq_file: document how per-entry resources are managed. fs/coredump: use kmap_local_page() init/Kconfig: fix a typo in CC_VERSION_TEXT help text init: clean up early_param_on_off() macro init/version.c: remove Version_<LINUX_VERSION_CODE> symbol checkpatch: do not apply "initialise globals to 0" check to BPF progs checkpatch: don't warn about colon termination in linker scripts checkpatch: add kmalloc_array_node to unnecessary OOM message check checkpatch: add warning for avoiding .L prefix symbols in assembly files checkpatch: improve TYPECAST_INT_CONSTANT test message checkpatch: prefer ftrace over function entry/exit printks checkpatch: trivial style fixes checkpatch: ignore warning designated initializers using NR_CPUS checkpatch: improve blank line after declaration test ...
2021-02-26virtio-mem: check against mhp_get_pluggable_range() which memory we can hotplugDavid Hildenbrand1-14/+27
Right now, we only check against MAX_PHYSMEM_BITS - but turns out there are more restrictions of which memory we can actually hotplug, especially om arm64 or s390x once we support them: we might receive something like -E2BIG or -ERANGE from add_memory_driver_managed(), stopping device operation. So, check right when initializing the device which memory we can add, warning the user. Try only adding actually pluggable ranges: in the worst case, no memory provided by our device is pluggable. In the usual case, we expect all device memory to be pluggable, and in corner cases only some memory at the end of the device-managed memory region to not be pluggable. Link: https://lkml.kernel.org/r/1612149902-7867-5-git-send-email-anshuman.khandual@arm.com Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: teawater <teawaterz@linux.alibaba.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Pankaj Gupta <pankaj.gupta@cloud.ionos.com> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-26mm/memory_hotplug: MEMHP_MERGE_RESOURCE -> MHP_MERGE_RESOURCEDavid Hildenbrand1-1/+1
Let's make "MEMHP_MERGE_RESOURCE" consistent with "MHP_NONE", "mhp_t" and "mhp_flags". As discussed recently [1], "mhp" is our internal acronym for memory hotplug now. [1] https://lore.kernel.org/linux-mm/c37de2d0-28a1-4f7d-f944-cfd7d81c334d@redhat.com/ Link: https://lkml.kernel.org/r/20210126115829.10909-1-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Miaohe Lin <linmiaohe@huawei.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Acked-by: Wei Liu <wei.liu@kernel.org> Reviewed-by: Pankaj Gupta <pankaj.gupta@cloud.ionos.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Juergen Gross <jgross@suse.com> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-23virtio-input: add multi-touch supportMathias Crombez1-1/+10
Without multi-touch slots allocated, ABS_MT_SLOT events will be lost by input_handle_abs_event. Implementation is based on uinput_create_device. Signed-off-by: Mathias Crombez <mathias.crombez@faurecia.com> Co-developed-by: Vasyl Vavrychuk <vasyl.vavrychuk@opensynergy.com> Signed-off-by: Vasyl Vavrychuk <vasyl.vavrychuk@opensynergy.com> Link: https://lore.kernel.org/r/20210115002623.8576-1-vasyl.vavrychuk@opensynergy.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
2021-02-23virtio_mmio: fix one typoXianting Tian1-1/+1
fix the typo 'there is are' to 'there are'. Signed-off-by: Xianting Tian <xianting_tian@126.com> Link: https://lore.kernel.org/r/1612615619-8128-1-git-send-email-xianting_tian@126.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-02-23virtio_input: Prevent EV_MSC/MSC_TIMESTAMP loop storm for MT.Colin Xu1-0/+15
In 'commit 29cc309d8bf1 ("HID: hid-multitouch: forward MSC_TIMESTAMP")', EV_MSC/MSC_TIMESTAMP is added to each before EV_SYN event. EV_MSC is configured as INPUT_PASS_TO_ALL. In case of a touch device which report MSC_TIMESTAMP: BE pass EV_MSC/MSC_TIMESTAMP to FE on receiving event from evdev. FE pass EV_MSC/MSC_TIMESTAMP back to BE. BE writes EV_MSC/MSC_TIMESTAMP to evdev due to INPUT_PASS_TO_ALL. BE receives extra EV_MSC/MSC_TIMESTAMP and pass to FE. >>> Each new frame becomes larger and larger. Disable EV_MSC/MSC_TIMESTAMP forwarding for MT. V2: Rebase. Signed-off-by: Colin Xu <colin.xu@intel.com> Link: https://lore.kernel.org/r/20210202001923.6227-1-colin.xu@intel.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
2021-02-23virtio_vdpa: don't warn when fail to disable vqJason Wang1-2/+1
There's no guarantee that the device can disable a specific virtqueue through set_vq_ready(). One example is the modern virtio-pci device. So this patch removes the warning. Signed-off-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20210104065503.199631-19-jasowang@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-02-23virtio-pci: introduce modern device moduleJason Wang5-643/+610
Signed-off-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20210104065503.199631-17-jasowang@redhat.com Including a bugfix: virtio: don't prompt CONFIG_VIRTIO_PCI_MODERN Cc: Arnd Bergmann <arnd@arndb.de> Cc: Anders Roxell <anders.roxell@linaro.org> Cc: Guenter Roeck <linux@roeck-us.net> Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Fixes: 86b87c9d858b6 ("virtio-pci: introduce modern device module") Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20210223061905.422659-2-jasowang@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-02-23virito-pci-modern: rename map_capability() to vp_modern_map_capability()Jason Wang1-16/+30
To ease the split, map_capability() was renamed to vp_modern_map_capability(). While at it, add the comments for the arguments and switch to use virtio_pci_modern_device as the first parameter. Signed-off-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20210104065503.199631-16-jasowang@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-02-23virtio-pci-modern: introduce helper to get notification offsetJason Wang1-5/+16
This patch introduces help to get notification offset of modern device. Signed-off-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20210104065503.199631-15-jasowang@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-02-23virtio-pci-modern: introduce helper for getting queue numsJason Wang1-1/+12
This patch introduces helper for getting queue num of modern device. Signed-off-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20210104065503.199631-14-jasowang@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-02-23virtio-pci-modern: introduce helper for setting/geting queue sizeJason Wang1-2/+32
This patch introduces helper for setting/getting queue size for modern device. Signed-off-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20210104065503.199631-13-jasowang@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-02-23virtio-pci-modern: introduce helper to set/get queue_enableJason Wang1-6/+31
This patch introduces a helper to set/get queue_enable for modern device. Signed-off-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20210104065503.199631-12-jasowang@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-02-23virtio-pci-modern: introduce vp_modern_queue_address()Jason Wang1-6/+27
This patch introduce a helper to set virtqueue address for modern address. Signed-off-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20210104065503.199631-11-jasowang@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-02-23virtio-pci-modern: introduce vp_modern_set_queue_vector()Jason Wang1-12/+23
This patch introduces a helper to set virtqueue MSI vector. Signed-off-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20210104065503.199631-10-jasowang@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-02-23virtio-pci-modern: introduce vp_modern_generation()Jason Wang1-3/+14
This patch introduces vp_modern_generation() to get device generation. Signed-off-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20210104065503.199631-9-jasowang@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-02-23virtio-pci-modern: introduce helpers for setting and getting featuresJason Wang1-10/+33
This patch introduces helpers for setting and getting features. Signed-off-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20210104065503.199631-8-jasowang@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-02-23virtio-pci-modern: introduce helpers for setting and getting statusJason Wang1-8/+29
This patch introduces helpers to allow set and get device status. Signed-off-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20210104065503.199631-7-jasowang@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-02-23virtio-pci-modern: introduce helper to set config vectorJason Wang1-2/+14
This patch introduces vp_modern_config_vector() for setting config vector. Signed-off-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20210104065503.199631-6-jasowang@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-02-23virtio-pci-modern: introduce vp_modern_remove()Jason Wang1-2/+12
This patch introduces vp_modern_remove() doing device resources cleanup to make it can be used. Signed-off-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20210104065503.199631-5-jasowang@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-02-23virtio-pci-modern: factor out modern device initialization logicJason Wang1-14/+36
This patch factors out the modern device initialization logic into a helper. Note that it still depends on the caller to enable pci device which allows the caller to use e.g devres. Signed-off-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20210104065503.199631-4-jasowang@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-02-23virtio-pci: split out modern deviceJason Wang2-79/+105
This patch splits out the virtio-pci modern device only attributes into another structure. While at it, a dedicated probe method for modern only attributes is introduced. This may help for split the logic into a dedicated module. Signed-off-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20210104065503.199631-3-jasowang@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-02-23virtio-pci: do not access iomem via struct virtio_pci_device directlyJason Wang1-30/+46
Instead of accessing iomem via struct virito_pci_device directly, tweak to call the io accessors through the iomem structure. This will ease the splitting of modern virtio device logic. Signed-off-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20210104065503.199631-2-jasowang@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-02-23virtio-mem: Assign boolean values to a bool variableJiapeng Zhong1-1/+1
Fix the following coccicheck warnings: ./drivers/virtio/virtio_mem.c:2580:2-25: WARNING: Assignment of 0/1 to bool variable. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Zhong <abaci-bugfix@linux.alibaba.com> Link: https://lore.kernel.org/r/1611129031-82818-1-git-send-email-abaci-bugfix@linux.alibaba.com Signed-off-by: Tian Tao <tiantao6@hisilicon.com> Acked-by: David Hildenbrand <david@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-24Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds2-506/+1291
Pull virtio updates from Michael Tsirkin: - vdpa sim refactoring - virtio mem: Big Block Mode support - misc cleanus, fixes * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (61 commits) vdpa: Use simpler version of ida allocation vdpa: Add missing comment for virtqueue count uapi: virtio_ids: add missing device type IDs from OASIS spec uapi: virtio_ids.h: consistent indentions vhost scsi: fix error return code in vhost_scsi_set_endpoint() virtio_ring: Fix two use after free bugs virtio_net: Fix error code in probe() virtio_ring: Cut and paste bugs in vring_create_virtqueue_packed() tools/virtio: add barrier for aarch64 tools/virtio: add krealloc_array tools/virtio: include asm/bug.h vdpa/mlx5: Use write memory barrier after updating CQ index vdpa: split vdpasim to core and net modules vdpa_sim: split vdpasim_virtqueue's iov field in out_iov and in_iov vdpa_sim: make vdpasim->buffer size configurable vdpa_sim: use kvmalloc to allocate vdpasim->buffer vdpa_sim: set vringh notify callback vdpa_sim: add set_config callback in vdpasim_dev_attr vdpa_sim: add get_config callback in vdpasim_dev_attr vdpa_sim: make 'config' generic and usable for any device type ...
2020-12-18virtio_ring: Fix two use after free bugsDan Carpenter1-2/+2
The "vq" struct is added to the "vdev->vqs" list prematurely. If we encounter an error later in the function then the "vq" is freed, but since it is still on the list that could lead to a use after free bug. Fixes: cbeedb72b97a ("virtio_ring: allocate desc state for split ring separately") Reported-by: Robert Buhren <robert.buhren@sect.tu-berlin.de> Reported-by: Felicitas Hetzelt <file@sect.tu-berlin.de> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/X8pGaG/zkI3jk8mk@mwanda Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2020-12-18virtio_ring: Cut and paste bugs in vring_create_virtqueue_packed()Dan Carpenter1-2/+2
There is a copy and paste bug in the error handling of this code and it uses "ring_dma_addr" three times instead of "device_event_dma_addr" and "driver_event_dma_addr". Fixes: 1ce9e6055fa0 (" virtio_ring: introduce packed ring support") Reported-by: Robert Buhren <robert.buhren@sect.tu-berlin.de> Reported-by: Felicitas Hetzelt <file@sect.tu-berlin.de> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/X8pGRJlEzyn+04u2@mwanda Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2020-12-18virtio-mem: Big Block Mode (BBM) - safe memory hotunplugDavid Hildenbrand1-2/+95
Let's add a safe mechanism to unplug memory, avoiding long/endless loops when trying to offline memory - similar to in SBM. Fake-offline all memory (via alloc_contig_range()) before trying to offline+remove it. Use this mode as default, but allow to enable the other mode explicitly (which could give better memory hotunplug guarantees in some environments). The "unsafe" mode can be enabled e.g., via virtio_mem.bbm_safe_unplug=0 on the cmdline. Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Oscar Salvador <osalvador@suse.de> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-30-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18virtio-mem: Big Block Mode (BBM) - basic memory hotunplugDavid Hildenbrand1-1/+155
Let's try to unplug completely offline big blocks first. Then, (if enabled via unplug_offline) try to offline and remove whole big blocks. No locking necessary - we can deal with concurrent onlining/offlining just fine. Note1: This is sub-optimal and might be dangerous in some environments: we could end up in an infinite loop when offlining (e.g., long-term pinnings), similar as with DIMMs. We'll introduce safe memory hotunplug via fake-offlining next, and use this basic mode only when explicitly enabled. Note2: Without ZONE_MOVABLE, memory unplug will be extremely unreliable with bigger block sizes. Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Oscar Salvador <osalvador@suse.de> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-29-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18virtio-mem: allow to force Big Block Mode (BBM) and set the big block sizeDavid Hildenbrand1-3/+28
Let's allow to force BBM, even if subblocks would be possible. Take care of properly calculating the first big block id, because the start address might no longer be aligned to the big block size. Also, allow to manually configure the size of Big Blocks. Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Oscar Salvador <osalvador@suse.de> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-27-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18virtio-mem: Big Block Mode (BBM) memory hotplugDavid Hildenbrand1-119/+441
Currently, we do not support device block sizes that exceed the Linux memory block size. For example, having a device block size of 1 GiB (e.g., gigantic pages in the hypervisor) won't work with 128 MiB Linux memory blocks. Let's implement Big Block Mode (BBM), whereby we add/remove at least one Linux memory block at a time. With a 1 GiB device block size, a Big Block (BB) will cover 8 Linux memory blocks. We'll keep registering the online_page_callback machinery, it will be used for safe memory hotunplug in BBM next. Note: BBM is properly prepared for variable-sized Linux memory blocks that we might see in the future. So we won't care how many Linux memory blocks a big block actually spans, and how the memory notifier is called. Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Oscar Salvador <osalvador@suse.de> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-26-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18virtio-mem: factor out adding/removing memory from LinuxDavid Hildenbrand1-34/+73
Let's use wrappers for the low-level functions that dev_dbg/dev_warn and work on addr + size, such that we can reuse them for adding/removing in other granularity. We only warn when adding memory failed, because that's something to pay attention to. We won't warn when removing failed, we'll reuse that in racy context soon (and we do have proper BUG_ON() statements in the current cases where it must never happen). Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-25-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18virtio-mem: memory notifier callbacks are specific to Sub Block Mode (SBM)David Hildenbrand1-14/+15
Let's rename accordingly. Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-24-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18virito-mem: existing (un)plug functions are specific to Sub Block Mode (SBM)David Hildenbrand1-47/+43
Let's rename them accordingly. virtio_mem_plug_request() and virtio_mem_unplug_request() will be handled separately. Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-23-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18virtio-mem: memory block ids are specific to Sub Block Mode (SBM)David Hildenbrand1-23/+23
Let's move first_mb_id/next_mb_id/last_usable_mb_id accordingly. Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-22-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18virtio-mem: nb_sb_per_mb and subblock_size are specific to Sub Block Mode (SBM)David Hildenbrand1-48/+48
Let's rename to "sbs_per_mb" and "sb_size" and move accordingly. Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-21-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18virito-mem: subblock states are specific to Sub Block Mode (SBM)David Hildenbrand1-63/+69
Let's rename and move accordingly. While at it, rename sb_bitmap to "sb_states". Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com> Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-20-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18virtio-mem: memory block states are specific to Sub Block Mode (SBM)David Hildenbrand1-106/+109
let's use a new "sbm" sub-struct to hold SBM-specific state and rename + move applicable definitions, functions, and variables (related to memory block states). While at it: - Drop the "_STATE" part from memory block states - Rename "nb_mb_state" to "mb_count" - "set_mb_state" / "get_mb_state" vs. "mb_set_state" / "mb_get_state" - Don't use lengthy "enum virtio_mem_smb_mb_state", simply use "uint8_t" Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com> Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-19-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18virito-mem: document Sub Block Mode (SBM)David Hildenbrand1-0/+15
Let's add some documentation for the current mode - Sub Block Mode (SBM) - to prepare for a new mode - Big Block Mode (BBM). Follow-up patches will properly factor out the existing Sub Block Mode (SBM) and implement Big Block Mode (BBM). Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com> Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-18-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18virtio-mem: generalize handling when memory is getting onlined deferredDavid Hildenbrand1-32/+63
We don't want to add too much memory when it's not getting onlined immediately, to avoid running OOM. Generalize the handling, to avoid making use of memory block states. Use a threshold of 1 GiB for now. Properly adjust the offline size when adding/removing memory. As we are not always protected by a lock when touching the offline size, use an atomic64_t. We don't care about races (e.g., someone offlining memory while we are adding more), only about consistent values. (1 GiB needs a memmap of ~16MiB - which sounds reasonable even for setups with little boot memory and (possibly) one virtio-mem device per node) We don't want to retrigger when onlining is caused immediately by our action (e.g., adding memory which immediately gets onlined), so use a flag to indicate if the workqueue is active and use that as an indicator whether to trigger a retry. This will also be especially relevant for Big Block Mode (BBM), whereby we might re-online memory in case offlining of another memory block failed. Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-17-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18virtio-mem: don't always trigger the workqueue when offlining memoryDavid Hildenbrand1-12/+28
Let's trigger from offlining code only when we're not allowed to unplug online memory. Handle the other case (memmap possibly freeing up another memory block) when actually removing memory. We now also properly handle the case when removing already offline memory blocks via virtio_mem_mb_remove(). When removing via virtio_mem_remove(), when unloading the driver, virtio_mem_retry() is a NOP and safe to use. While at it, move retry handling when offlining out of virtio_mem_notify_offline(), to share it with Big Block Mode (BBM) soon. This is a preparation for Big Block Mode (BBM), whereby we can see some temporary offlining of memory blocks without actually making progress. Imagine you have a Big Block that spans to Linux memory blocks. Assume the first Linux memory blocks has no unmovable data on it. When we would call offline_and_remove_memory() on the big block, we would 1. Try to offline the first block. Works, notifiers triggered. virtio_mem_retry() called. 2. Try to offline the second block. Does not work. 3. Re-online first block. 4. Exit to main loop, exit workqueue. 5. Retry immediately (due to virtio_mem_retry()), go to 1. The result are endless retries. Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-16-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18virtio-mem: drop last_mb_idDavid Hildenbrand1-4/+0
No longer used, let's drop it. Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com> Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-15-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18virtio-mem: generalize virtio_mem_overlaps_range()David Hildenbrand1-7/+3
Avoid using memory block ids. While at it, use uint64_t for address/size. This is a preparation for Big Block Mode (BBM). Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-14-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18virtio-mem: generalize virtio_mem_owned_mb()David Hildenbrand1-4/+5
Avoid using memory block ids. Rename it to virtio_mem_contains_range(). This is a preparation for Big Block Mode (BBM). Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-13-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18virtio-mem: generalize check for added memoryDavid Hildenbrand1-4/+15
Let's check by traversing busy system RAM resources instead, to avoid relying on memory block states. Don't use walk_system_ram_range(), as that works on pages and we want to use the bare addresses we have easily at hand. This is a preparation for Big Block Mode (BBM), which won't have memory block states. Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-12-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18virtio-mem: retry fake-offlining via alloc_contig_range() on ZONE_MOVABLEDavid Hildenbrand1-11/+26
ZONE_MOVABLE is supposed to give some guarantees, yet, alloc_contig_range() isn't prepared to properly deal with some racy cases properly (e.g., temporary page pinning when exiting processed, PCP). Retry 5 times for now. There is certainly room for improvement in the future. Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-11-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18virtio-mem: factor out handling of fake-offline pages in memory notifierDavid Hildenbrand1-23/+50
Let's factor out the core pieces and place the implementation next to virtio_mem_fake_offline(). We'll reuse this functionality soon. Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-10-david@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>