path: root/drivers/s390/cio/vfio_ccw_drv.c
Age | Commit message | Author | Files | Lines
2022-12-05 | vfio/ap/ccw/samples: Fix device_register() unwind path | Alex Williamson | 1 | -1/+2
We always need to call put_device() if device_register() fails. All vfio drivers calling device_register() include a similar unwind stack via gotos, therefore split device_unregister() into its device_del() and put_device() components in the unwind path, and add a goto target to handle only the put_device() requirement. Reported-by: Ruan Jinjie <ruanjinjie@huawei.com> Link: https://lore.kernel.org/all/20221118032827.3725190-1-ruanjinjie@huawei.com Fixes: d61fc96f47fd ("sample: vfio mdev display - host device") Fixes: 9d1a546c53b4 ("docs: Sample driver to demonstrate how to use Mediated device framework.") Fixes: a5e6e6505f38 ("sample: vfio bochs vbe display (host device for bochs-drm)") Fixes: 9e6f07cd1eaa ("vfio/ccw: create a parent struct") Fixes: 36360658eb5a ("s390: vfio_ap: link the vfio_ap devices to the vfio_ap bus subsystem") Cc: Tony Krowiak <akrowiak@linux.ibm.com> Cc: Halil Pasic <pasic@linux.ibm.com> Cc: Jason Herne <jjherne@linux.ibm.com> Cc: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com> Link: https://lore.kernel.org/r/166999942139.645727.12439756512449846442.stgit@omen Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
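The unwind ordering described above can be sketched in user space. This is a minimal illustration, not the driver code: `mock_device`, `mock_device_register()`, `mock_device_del()`, `mock_put_device()` and `mock_probe()` are hypothetical stand-ins, and the only property modeled is that a failed register still leaves the caller holding one reference.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct device: a refcount and an "added" flag. */
struct mock_device {
    int refcount;
    bool added;
};

/* Models device_register(): the initial reference is taken even on failure. */
int mock_device_register(struct mock_device *dev, bool fail)
{
    dev->refcount = 1;      /* reference exists before the add step runs */
    if (fail)
        return -1;          /* caller still owns that one reference */
    dev->added = true;
    return 0;
}

void mock_device_del(struct mock_device *dev) { dev->added = false; }
void mock_put_device(struct mock_device *dev) { dev->refcount--; }

/*
 * Probe-style function with two unwind targets: a failure after a
 * successful register needs del + put, a failed register needs only put.
 */
int mock_probe(struct mock_device *dev, bool fail_register, bool fail_later)
{
    int ret;

    ret = mock_device_register(dev, fail_register);
    if (ret)
        goto out_put;

    if (fail_later) {       /* stands in for any later setup step failing */
        ret = -2;
        goto out_del;
    }
    return 0;

out_del:
    mock_device_del(dev);
out_put:
    mock_put_device(dev);
    return ret;
}
```

Splitting the combined unregister into its two halves lets both failure points share the tail of the same goto ladder, so the reference is dropped exactly once on every error path.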
2022-11-10 | vfio/ccw: replace vfio_init_device with _alloc_ | Eric Farman | 1 | -18/+0
Now that we have a reasonable separation of structs that follow the subchannel and mdev lifecycles, there's no reason we can't call the official vfio_alloc_device routine for our private data, and behave like everyone else. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20221104142007.1314999-7-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-11-10 | vfio/ccw: move private to mdev lifecycle | Eric Farman | 1 | -15/+1
Now that the mdev parent data is split out into its own struct, it is safe to move the remaining private data to follow the mdev probe/remove lifecycle. The mdev parent data will remain where it is, and follow the subchannel and the css driver interfaces. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20221104142007.1314999-5-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-11-10 | vfio/ccw: move private initialization to callback | Eric Farman | 1 | -65/+9
There's already a device initialization callback that is used to initialize the release completion workaround that was introduced by commit ebb72b765fb49 ("vfio/ccw: Use the new device life cycle helpers"). Move the other elements of the vfio_ccw_private struct that require distinct initialization over to that routine. With that done, the vfio_ccw_alloc_private routine only does a kzalloc, so fold it inline. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20221104142007.1314999-4-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-11-10 | vfio/ccw: remove private->sch | Eric Farman | 1 | -2/+1
These places all rely on the ability to jump from a private struct back to the subchannel struct. Rather than keeping a copy in our back pocket, let's use the relationship provided by the vfio_device embedded within the private. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20221104142007.1314999-3-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-11-10 | vfio/ccw: create a parent struct | Eric Farman | 1 | -18/+80
Move the stuff associated with the mdev parent (and thus the subchannel struct) into its own struct, and leave the rest in the existing private structure. The subchannel will point to the parent, and the parent will point to the private, for the areas where one or both are needed. Further separation of these structs will follow. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20221104142007.1314999-2-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-10-04 | vfio/mdev: add mdev available instance checking to the core | Jason Gunthorpe | 1 | -1/+0
Many of the mdev drivers use a simple counter for keeping track of the available instances. Move this code to the core code and store the counter in the mdev_parent. Implement it using correct locking, fixing mdpy. Drivers just provide the value in the mdev_driver at registration time and the core code takes care of maintaining it and exposing the value in sysfs. [hch: count instances per-parent instead of per-type, use an atomic_t to avoid taking mdev_list_lock in the show method] Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Eric Farman <farman@linux.ibm.com> Link: https://lore.kernel.org/r/20220923092652.100656-15-hch@lst.de Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
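A per-parent instance counter of the kind described above might look like the following C11 sketch. It is an illustration under assumptions, not the mdev core code: `mock_parent`, `mock_instance_get()`, `mock_instance_put()` and `mock_instances_show()` are hypothetical names, and the compare-exchange loop is just one way to decrement only while the count is positive.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical parent structure holding the shared counter. */
struct mock_parent {
    atomic_int available_instances;
};

/* Claim one instance; returns false when none are left. */
bool mock_instance_get(struct mock_parent *p)
{
    int cur = atomic_load(&p->available_instances);

    while (cur > 0) {
        /* On failure, cur is reloaded with the current value and we retry. */
        if (atomic_compare_exchange_weak(&p->available_instances,
                                         &cur, cur - 1))
            return true;
    }
    return false;
}

/* Give an instance back when the mediated device goes away. */
void mock_instance_put(struct mock_parent *p)
{
    atomic_fetch_add(&p->available_instances, 1);
}

/* Lock-free read, as a sysfs-style "show" method would want. */
int mock_instances_show(struct mock_parent *p)
{
    return atomic_load(&p->available_instances);
}
```

Because the reader only needs a single atomic load, it does not have to take whatever lock serializes device creation, which is the motivation given in the bracketed note of the commit message.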
2022-10-04 | vfio/mdev: consolidate all the name sysfs into the core code | Christoph Hellwig | 1 | -0/+1
Every driver just emits a static string, simply add a field to the mdev_type for the driver to fill out or fall back to the sysfs name and provide a standard sysfs show function. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Eric Farman <farman@linux.ibm.com> Link: https://lore.kernel.org/r/20220923092652.100656-12-hch@lst.de Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-10-04 | vfio/mdev: simplify mdev_type handling | Christoph Hellwig | 1 | -2/+4
Instead of abusing struct attribute_group to control initialization of struct mdev_type, just define the actual attributes in the mdev_driver, allocate the mdev_type structures in the caller and pass them to mdev_register_parent. This allows the caller to use container_of to get at the containing structure and thus significantly simplify the code. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Eric Farman <farman@linux.ibm.com> Link: https://lore.kernel.org/r/20220923092652.100656-6-hch@lst.de Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-10-04 | vfio/mdev: embedd struct mdev_parent in the parent data structure | Christoph Hellwig | 1 | -2/+3
Simplify mdev_{un}register_device by requiring the caller to pass in a structure allocated as part of the parent device structure. This removes the need for a list of parents and the separate mdev_parent refcount as we can simply rely on the reference to the parent device. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Eric Farman <farman@linux.ibm.com> Link: https://lore.kernel.org/r/20220923092652.100656-5-hch@lst.de Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-10-04 | vfio/mdev: make mdev.h standalone includable | Christoph Hellwig | 1 | -1/+0
Include <linux/device.h> and <linux/uuid.h> so that users of this header don't need to do that, and remove those includes that aren't needed any more. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com> Link: https://lore.kernel.org/r/20220923092652.100656-4-hch@lst.de Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-08-01 | vfio/ccw: Remove FSM Close from remove handlers | Eric Farman | 1 | -1/+0
Now that neither vfio_ccw_sch_probe() nor vfio_ccw_mdev_probe() affect the FSM state, it doesn't make sense for their _remove() counterparts to try to revert things in this way. Since the FSM open and close are handled alongside MDEV open/close, these are unnecessary. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20220728204914.2420989-3-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-07 | vfio/ccw: Move FSM open/close to MDEV open/close | Eric Farman | 1 | -8/+3
Part of the confusion that has existed is the FSM lifecycle of subchannels between the common CSS driver and the vfio-ccw driver. During configuration, the FSM state goes from NOT_OPER to STANDBY to IDLE, but then back to NOT_OPER. For example:
    vfio_ccw_sch_probe: VFIO_CCW_STATE_NOT_OPER
    vfio_ccw_sch_probe: VFIO_CCW_STATE_STANDBY
    vfio_ccw_mdev_probe: VFIO_CCW_STATE_IDLE
    vfio_ccw_mdev_remove: VFIO_CCW_STATE_NOT_OPER
    vfio_ccw_sch_remove: VFIO_CCW_STATE_NOT_OPER
    vfio_ccw_sch_shutdown: VFIO_CCW_STATE_NOT_OPER
Rearrange the open/close events to align with the mdev open/close, to better manage the memory and state of the devices as time progresses. Specifically, make mdev_open() perform the FSM open, and mdev_close() perform the FSM close instead of reset (which is both close and open). This makes the NOT_OPER state a dead-end path, indicating the device is probably not recoverable without fully probing and re-configuring the device. This has the nice side-effect of removing a number of special-cases where the FSM state is managed outside of the FSM itself (such as the aforementioned mdev_close() routine). Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20220707135737.720765-12-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-07 | vfio/ccw: Create a CLOSE FSM event | Eric Farman | 1 | -12/+5
Refactor the vfio_ccw_sch_quiesce() routine to extract the bit that disables the subchannel and affects the FSM state. Use this to form the basis of a CLOSE event that will mirror the OPEN event, and move the subchannel back to NOT_OPER state. A key difference with that mirroring is that while OPEN handles the transition from NOT_OPER => STANDBY, the later probing of the mdev handles the transition from STANDBY => IDLE. On the other hand, the CLOSE event will move from one of the operating states {IDLE, CP_PROCESSING, CP_PENDING} => NOT_OPER. That is, there is no stop in a STANDBY state on the deconfigure path. Add a call to cp_free() in this event, such that it is captured for the various permutations of this event. In the unlikely event that cio_disable_subchannel() returns -EBUSY, the remaining logic of vfio_ccw_sch_quiesce() can still be used. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20220707135737.720765-10-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-07 | vfio/ccw: Create an OPEN FSM Event | Eric Farman | 1 | -7/+2
Move the process of enabling a subchannel for use by vfio-ccw into the FSM, such that it can manage the sequence of lifecycle events for the device. That is, if the FSM state is NOT_OPER(erational), then do the work that would enable the subchannel and move the FSM to STANDBY state. An attempt to perform this event again from any of the other operating states (IDLE, CP_PROCESSING, CP_PENDING) will convert the device back to NOT_OPER so the configuration process can be started again. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20220707135737.720765-9-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
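The OPEN transitions described above, together with the CLOSE transitions from the preceding entry, can be sketched as a small transition function. This is a simplified model, not the driver's jump table: the `mock_` names are hypothetical, and the behavior for combinations the commit messages do not mention (OPEN in STANDBY, CLOSE in NOT_OPER or STANDBY) is an assumption made here, namely that the state is left unchanged.

```c
#include <assert.h>

/* Hypothetical mirror of the states named in the commit messages. */
enum mock_state {
    MOCK_NOT_OPER,
    MOCK_STANDBY,
    MOCK_IDLE,
    MOCK_CP_PROCESSING,
    MOCK_CP_PENDING
};

enum mock_event { MOCK_EV_OPEN, MOCK_EV_CLOSE };

/* True for the three states the messages call "operating states". */
static int mock_is_operating(enum mock_state s)
{
    return s == MOCK_IDLE || s == MOCK_CP_PROCESSING || s == MOCK_CP_PENDING;
}

enum mock_state mock_fsm_event(enum mock_state s, enum mock_event ev)
{
    switch (ev) {
    case MOCK_EV_OPEN:
        if (s == MOCK_NOT_OPER)
            return MOCK_STANDBY;    /* enable the subchannel */
        if (mock_is_operating(s))
            return MOCK_NOT_OPER;   /* repeated OPEN forces reconfiguration */
        return s;                   /* assumption: STANDBY is left alone */
    case MOCK_EV_CLOSE:
        if (mock_is_operating(s))
            return MOCK_NOT_OPER;   /* no stop in STANDBY on the way down */
        return s;                   /* assumption: nothing to close */
    }
    return s;
}
```

The STANDBY to IDLE step is deliberately absent: per the messages it is performed by the later mdev probe rather than by either event.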
2022-07-07 | vfio/ccw: Flatten MDEV device (un)register | Eric Farman | 1 | -2/+2
The vfio_ccw_mdev_(un)reg routines are merely vfio-ccw routines that pass control to mdev_(un)register_device. Since there's only one caller of each, let's just call the mdev routines directly. Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20220707135737.720765-7-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-07 | vfio/ccw: Do not change FSM state in subchannel event | Eric Farman | 1 | -11/+3
The routine vfio_ccw_sch_event() is tasked with handling subchannel events, specifically machine checks, on behalf of vfio-ccw. It correctly calls cio_update_schib(), and if that fails (meaning the subchannel is gone) it makes an FSM event call to mark the subchannel Not Operational. If that worked, however, then it decides that if the FSM state was already Not Operational (implying the subchannel just came back), then it should simply change the FSM to partially- or fully-open. Remove this trickery, since a subchannel returning will require more probing than simply "oh all is well again" to ensure it works correctly. Fixes: bbe37e4cb8970 ("vfio: ccw: introduce a finite state machine") Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20220707135737.720765-4-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-07 | vfio/ccw: Fix FSM state if mdev probe fails | Eric Farman | 1 | -2/+3
The FSM is in STANDBY state when arriving in vfio_ccw_mdev_probe(), and this routine converts it to IDLE as part of its processing. The error exit sets it to IDLE (again) but clears the private->mdev pointer. The FSM should of course be managing the state itself, but the correct thing for vfio_ccw_mdev_probe() to do would be to put the state back the way it found it. The corresponding check of private->mdev in vfio_ccw_sch_io_todo() can be removed, since the distinction is unnecessary at this point. Fixes: 3bf1311f351ef ("vfio/ccw: Convert to use vfio_register_emulated_iommu_dev()") Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20220707135737.720765-3-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-07 | vfio/ccw: Remove UUID from s390 debug log | Michael Kawano | 1 | -3/+2
As vfio-ccw devices are created/destroyed, the uuid of the associated mdevs that are recorded in $S390DBF/vfio_ccw_msg/sprintf get lost. This is because a pointer to the UUID is stored instead of the UUID itself, and that memory may have been repurposed if/when the logs are examined. The result is usually garbage UUID data in the logs, though there is an outside chance of an oops happening here. Simply remove the UUID from the traces, as the subchannel number will provide useful configuration information for problem determination, and is stored directly into the log instead of a pointer. As we were the only consumer of mdev_uuid(), remove that too. Cc: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Michael Kawano <mkawano@linux.ibm.com> Fixes: 60e05d1cf0875 ("vfio-ccw: add some logging") Fixes: b7701dfbf9832 ("vfio-ccw: Register a chp_event callback for vfio-ccw") [farman: reworded commit message, added Fixes: tags] Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com> Link: https://lore.kernel.org/r/20220707135737.720765-2-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-12-06 | s390/cio: remove uevent suppress from cio driver | Vineeth Vijayan | 1 | -5/+0
commit fa1a8c23eb7d ("s390: cio: Delay uevents for subchannels") introduced suppression of uevents for a subchannel until after it is clear that the subchannel would not be unregistered again immediately. This was done to avoid uevents being generated for I/O subchannels with no valid device, which can happen on LPAR. However, this also has some drawbacks: All subchannel drivers need to manually remove the uevent suppression and generate an ADD uevent as soon as they are sure that the subchannel will stay around. This misses out on all uevents that are not the initial ADD uevent that would be generated while uevents are suppressed; for example, all subchannels were missing the BIND uevent. As uevents being generated even for I/O subchannels without an operational device turned out to be not as bad as missing uevents and complicating the code flow, let's remove uevent suppression for subchannels. Signed-off-by: Vineeth Vijayan <vneethv@linux.ibm.com> [cohuck@redhat.com: modified changelog] Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com> Link: https://lore.kernel.org/r/20211122103756.352463-2-vneethv@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-10-28 | vfio/ccw: Convert to use vfio_register_emulated_iommu_dev() | Jason Gunthorpe | 1 | -7/+14
This is a more complicated conversion because vfio_ccw is sharing the vfio_device between both the mdev_device, its vfio_device and the css_driver. The mdev is a singleton, and the reason for this sharing is so that the extra css_driver function callbacks can be delivered to the vfio_device implementation. This keeps things as they are, with the css_driver allocating the singleton, not the mdev_driver. Embed the vfio_device in the vfio_ccw_private and instantiate it as a vfio_device when the mdev probes. The drvdata of both the css_device and the mdev_device point at the private, and container_of is used to get it back from the vfio_device. Reviewed-by: Eric Farman <farman@linux.ibm.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/4-v4-cea4f5bd2c00+b52-ccw_mdev_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
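The embed-and-recover pattern mentioned above can be shown in a few lines of portable C. This is a sketch only: `mock_container_of`, `mock_vfio_device`, `mock_ccw_private` and `mock_to_private()` are hypothetical stand-ins for the kernel's `container_of()` macro and the driver's structures.

```c
#include <assert.h>
#include <stddef.h>

/* User-space equivalent of the kernel's container_of() idea. */
#define mock_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for struct vfio_device. */
struct mock_vfio_device {
    int open_count;
};

/* Hypothetical stand-in for the private struct; the device is embedded. */
struct mock_ccw_private {
    int state;
    struct mock_vfio_device vdev;   /* a member, not a pointer */
};

/* Recover the containing private struct from the embedded device. */
struct mock_ccw_private *mock_to_private(struct mock_vfio_device *vdev)
{
    return mock_container_of(vdev, struct mock_ccw_private, vdev);
}
```

Since the outer address is computed from the member's offset, no back-pointer has to be stored or kept in sync, which is also why a later entry in this log could drop a cached pointer in favor of the embedded relationship.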
2021-10-28 | vfio/ccw: Use functions for alloc/free of the vfio_ccw_private | Jason Gunthorpe | 1 | -47/+66
Makes the code easier to understand what is memory lifecycle and what is other stuff. Reviewed-by: Eric Farman <farman@linux.ibm.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/2-v4-cea4f5bd2c00+b52-ccw_mdev_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-10-28 | vfio/ccw: Remove unneeded GFP_DMA | Jason Gunthorpe | 1 | -1/+1
Since the ccw_io_region was split out of the private the allocation no longer needs the GFP_DMA. Remove it. Reported-by: Christoph Hellwig <hch@infradead.org> Fixes: c98e16b2fa12 ("s390/cio: Convert ccw_io_region to pointer") Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Eric Farman <farman@linux.ibm.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/1-v4-cea4f5bd2c00+b52-ccw_mdev_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-07-21 | s390/cio: Make struct css_driver::remove return void | Uwe Kleine-König | 1 | -2/+1
The driver core ignores the return value of css_remove() (because there is only little it can do when a device disappears) and all callbacks return 0 anyhow. So make it impossible for future drivers to return an unused error code by changing the remove prototype to return void. The real motivation for this change is the quest to make struct bus_type::remove return void, too. Reviewed-by: Cornelia Huck <cohuck@redhat.com> Acked-by: Vineeth Vijayan <vneethv@linux.ibm.com> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Link: https://lore.kernel.org/r/20210713193522.1770306-3-u.kleine-koenig@pengutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-12 | vfio-ccw: Serialize FSM IDLE state with I/O completion | Eric Farman | 1 | -2/+10
Today, the stacked call to vfio_ccw_sch_io_todo() does three things:
  1) Update a solicited IRB with CP information, and release the CP if the interrupt was the end of a START operation.
  2) Copy the IRB data into the io_region, under the protection of the io_mutex
  3) Reset the vfio-ccw FSM state to IDLE to acknowledge that vfio-ccw can accept more work.
The trouble is that step 3 is (A) invoked for both solicited and unsolicited interrupts, and (B) sitting after the mutex for step 2. This second piece becomes a problem if it processes an interrupt for a CLEAR SUBCHANNEL while another thread initiates a START, thus allowing the CP and FSM states to get out of sync. That is:
    CPU 1                           CPU 2
    fsm_do_clear()
    fsm_irq()
                                    fsm_io_request()
    vfio_ccw_sch_io_todo()
                                    fsm_io_helper()
Since the FSM state and CP should be kept in sync, let's make a note when the CP is released, and rely on that as an indication that the FSM should also be reset at the end of this routine and open up the device for more work. Signed-off-by: Eric Farman <farman@linux.ibm.com> Acked-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <20210511195631.3995081-4-farman@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
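The "make a note when the CP is released" idea can be sketched as follows. This is a simplified single-threaded model, not the driver routine: `mock_io`, `mock_io_todo()` and the `MOCK_IO_` states are hypothetical, the IRB copy under the mutex is elided, and the condition for releasing the channel program is reduced to "a solicited, final interrupt arrived while the state was CP_PENDING", which is an assumption about the exact check.

```c
#include <assert.h>
#include <stdbool.h>

enum mock_io_state { MOCK_IO_IDLE, MOCK_IO_CP_PROCESSING, MOCK_IO_CP_PENDING };

/* Hypothetical per-device bookkeeping. */
struct mock_io {
    bool cp_initialized;        /* a channel program is currently held */
    enum mock_io_state state;
};

void mock_io_todo(struct mock_io *io, bool solicited, bool is_final)
{
    bool cp_is_finished = false;

    /* Step 1: release the CP only at the end of a solicited START. */
    if (solicited && is_final && io->state == MOCK_IO_CP_PENDING) {
        io->cp_initialized = false;     /* stands in for cp_free() */
        cp_is_finished = true;
    }

    /* Step 2: copying the IRB into the I/O region is elided here. */

    /* Step 3: go back to IDLE only if this call released the CP. */
    if (cp_is_finished)
        io->state = MOCK_IO_IDLE;
}
```

With the reset tied to the flag, an interrupt that did not end a START (for example one belonging to a clear) no longer flips the state to IDLE underneath a concurrent request that is still building its channel program.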
2020-06-03 | vfio-ccw: Add trace for CRW event | Eric Farman | 1 | -0/+1
Since CRW events are (should be) rare, let's put a trace in that routine too. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <20200505122745.53208-9-farman@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2020-06-03 | vfio-ccw: Wire up the CRW irq and CRW region | Farhan Ali | 1 | -0/+49
Use the IRQ to notify userspace that there is a CRW pending in the region, related to path-availability changes on the passthrough subchannel. Signed-off-by: Farhan Ali <alifm@linux.ibm.com> Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <20200505122745.53208-8-farman@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2020-06-03 | vfio-ccw: Introduce a new CRW region | Farhan Ali | 1 | -0/+20
This region provides a mechanism to pass a Channel Report Word that affects vfio-ccw devices and needs to be passed to the guest for its awareness and/or processing. The base driver (see crw_collect_info()) provides space for two CRWs, as a subchannel event may have two CRWs chained together (one for the ssid, one for the subchannel). As vfio-ccw will deal with everything at the subchannel level, provide space for a single CRW to be transferred in one shot. Signed-off-by: Farhan Ali <alifm@linux.ibm.com> Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <20200505122745.53208-7-farman@linux.ibm.com> [CH: added padding to ccw_crw_region] Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2020-06-02 | vfio-ccw: Introduce a new schib region | Farhan Ali | 1 | -0/+20
The schib region can be used by userspace to get the subchannel- information block (SCHIB) for the passthrough subchannel. This can be useful to get information such as channel path information via the SCHIB.PMCW fields. Signed-off-by: Farhan Ali <alifm@linux.ibm.com> Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <20200505122745.53208-5-farman@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2020-06-02 | vfio-ccw: Register a chp_event callback for vfio-ccw | Farhan Ali | 1 | -0/+47
Register the chp_event callback to receive channel path related events for the subchannels managed by vfio-ccw. Signed-off-by: Farhan Ali <alifm@linux.ibm.com> Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <20200505122745.53208-3-farman@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2020-06-02 | vfio-ccw: Introduce new helper functions to free/destroy regions | Farhan Ali | 1 | -10/+18
Consolidate some of the cleanup code for the regions, so that as more are added we reduce code duplication. Signed-off-by: Farhan Ali <alifm@linux.ibm.com> Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <20200505122745.53208-2-farman@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2020-04-06 | s390/cio: generate delayed uevent for vfio-ccw subchannels | Cornelia Huck | 1 | -0/+5
The common I/O layer delays the ADD uevent for subchannels and delegates generating this uevent to the individual subchannel drivers. The vfio-ccw I/O subchannel driver, however, did not do that, and will not generate an ADD uevent for subchannels that had not been bound to a different driver (or none at all, which also triggers the uevent). Generate the ADD uevent at the end of the probe function if uevents were still suppressed for the device. Message-Id: <20200327124503.9794-3-cohuck@redhat.com> Fixes: 63f1934d562d ("vfio: ccw: basic implementation for vfio_ccw driver") Reviewed-by: Eric Farman <farman@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-09-05 | vfio-ccw: fix error return code in vfio_ccw_sch_init() | Wei Yongjun | 1 | -2/+6
Fix to return the negative error code -ENOMEM from the memory allocation failure handling case instead of 0, as done elsewhere in this function. Fixes: 60e05d1cf087 ("vfio-ccw: add some logging") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Link: https://lore.kernel.org/kvm/20190904083315.105600-1-weiyongjun1@huawei.com/ Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
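The class of bug fixed above is easy to reproduce in miniature: an init function jumps to its cleanup label after a failed allocation but never sets the return variable, so the caller sees 0. The sketch below shows the corrected shape in user space; `mock_sch_init()` and its `fail_second_alloc` switch are hypothetical and exist only to force the failure path.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical init routine with two allocations. To stay leak-free as a
 * demo it releases both buffers on success as well; a real init routine
 * would keep them.
 */
int mock_sch_init(int fail_second_alloc)
{
    int ret = 0;
    void *first, *second;

    first = malloc(16);
    if (!first) {
        ret = -ENOMEM;
        goto out;
    }

    second = fail_second_alloc ? NULL : malloc(16);
    if (!second) {
        ret = -ENOMEM;      /* without this assignment, 0 would be returned */
        goto out_free_first;
    }

    free(second);
out_free_first:
    free(first);
out:
    return ret;
}
```

Setting `ret` immediately before each `goto` keeps the error code next to the condition that produced it, which makes an omission like the one fixed here easier to spot in review.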
2019-08-23 | vfio-ccw: add some logging | Cornelia Huck | 1 | -3/+47
Usually, the common I/O layer logs various things into the s390 cio debug feature, which has been very helpful in the past when looking at crash dumps. As vfio-ccw devices unbind from the standard I/O subchannel driver, we lose some information there. Let's introduce some vfio-ccw debug features and log some things there. (Unfortunately we cannot reuse the cio debug feature from a module.) Message-Id: <20190816151505.9853-2-cohuck@redhat.com> Reviewed-by: Eric Farman <farman@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2019-07-15 | vfio-ccw: Don't call cp_free if we are processing a channel program | Farhan Ali | 1 | -1/+1
There is a small window where it's possible that we could be working on an interrupt (queued in the workqueue) and setting up a channel program (i.e allocating memory, pinning pages, translating address). This can lead to allocating and freeing the channel program at the same time and can cause memory corruption. Let's not call cp_free if we are currently processing a channel program. The only way we know for sure that we don't have a thread setting up a channel program is when the state is set to VFIO_CCW_STATE_CP_PENDING. Fixes: d5afd5d135c8 ("vfio-ccw: add handling for async channel instructions") Signed-off-by: Farhan Ali <alifm@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <62e87bf67b38dc8d5760586e7c96d400db854ebe.1562854091.git.alifm@linux.ibm.com> Reviewed-by: Eric Farman <farman@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2019-07-08 | Merge tag 's390-5.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux | Linus Torvalds | 1 | -3/+10
Pull s390 updates from Vasily Gorbik:
- Improve stop_machine wait logic: replace cpu_relax_yield call in generic stop_machine function with a weak stop_machine_yield function. This is overridden on s390, which yields the current cpu to the neighbouring cpu after a couple of retries, instead of blindly giving up the cpu to the hypervisor. This significantly improves stop_machine performance on s390 in overcommitted scenarios. This includes common code changes which have been Acked by Peter Zijlstra and Thomas Gleixner.
- Improve jump label transformation speed: transform jump labels without using stop_machine.
- Refactoring of the vfio-ccw cp handling, simplifying the code and avoiding unneeded allocating/copying.
- Various vfio-ccw fixes (ccw translation, state machine).
- Add support for vfio-ap queue interrupt control in the guest. This includes s390 kvm changes which have been Acked by Christian Borntraeger.
- Add protected virtualization support for virtio-ccw.
- Enforce both CONFIG_SMP and CONFIG_HOTPLUG_CPU, which allows to remove some code which most likely isn't working at all, besides that s390 didn't even compile for !CONFIG_SMP.
- Support for special flagged EP11 CPRBs for zcrypt.
- Handle PCI devices with no support for new MIO instructions.
- Avoid KASAN false positives in reworked stack unwinder.
- Couple of fixes for the QDIO layer.
- Convert s390 specific documentation to ReST format.
- Let s390 crypto modules return -ENODEV instead of -EOPNOTSUPP if hardware is missing. This way our modules behave like most other modules and which is also what systemd's systemd-modules-load.service expects.
- Replace defconfig with performance_defconfig, so there is one config file less to maintain.
- Remove the SCLP call home device driver, which was never useful.
- Cleanups all over the place.
* tag 's390-5.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (83 commits)
  docs: s390: s390dbf: typos and formatting, update crash command
  docs: s390: unify and update s390dbf kdocs at debug.c
  docs: s390: restore important non-kdoc parts of s390dbf.rst
  vfio-ccw: Fix the conversion of Format-0 CCWs to Format-1
  s390/pci: correctly handle MIO opt-out
  s390/pci: deal with devices that have no support for MIO instructions
  s390: ap: kvm: Enable PQAP/AQIC facility for the guest
  s390: ap: implement PAPQ AQIC interception in kernel
  vfio: ap: register IOMMU VFIO notifier
  s390: ap: kvm: add PQAP interception for AQIC
  s390/unwind: cleanup unused READ_ONCE_TASK_STACK
  s390/kasan: avoid false positives during stack unwind
  s390/qdio: don't touch the dsci in tiqdio_add_input_queues()
  s390/qdio: (re-)initialize tiqdio list entries
  s390/dasd: Fix a precision vs width bug in dasd_feature_list()
  s390/cio: introduce driver_override on the css bus
  vfio-ccw: make convert_ccw0_to_ccw1 static
  vfio-ccw: Remove copy_ccw_from_iova()
  vfio-ccw: Factor out the ccw0-to-ccw1 transition
  vfio-ccw: Copy CCW data outside length calculation
  ...
2019-06-21 | vfio-ccw: Move guest_cp storage into common struct | Eric Farman | 1 | -0/+7
Rather than allocating/freeing a piece of memory every time we try to figure out how long a CCW chain is, let's use a piece of memory allocated for each device. The io_mutex added with commit 4f76617378ee9 ("vfio-ccw: protect the I/O region") is held for the duration of the VFIO_CCW_EVENT_IO_REQ event that accesses/uses this space, so there should be no race concerns with another CPU attempting an (unexpected) SSCH for the same device. Suggested-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Eric Farman <farman@linux.ibm.com> Message-Id: <20190618202352.39702-2-farman@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Farhan Ali <alifm@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2019-06-13 | vfio-ccw: Destroy kmem cache region on module exit | Farhan Ali | 1 | -0/+1
Free the vfio_ccw_cmd_region on module exit. Fixes: d5afd5d135c8 ("vfio-ccw: add handling for async channel instructions") Signed-off-by: Farhan Ali <alifm@linux.ibm.com> Message-Id: <c0f39039d28af39ea2939391bf005e3495d890fd.1559576250.git.alifm@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2019-06-03  s390/cio: Set vfio-ccw FSM state before ioeventfd  Eric Farman  1  -3/+3
Otherwise, the guest can believe it's okay to start another I/O and bump into the non-idle state. This results in a cc=2 (with the asynchronous CSCH/HSCH code) returned to the guest, which is unfortunate since everything is otherwise working normally.

Signed-off-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Message-Id: <20190514234248.36203-3-farman@linux.ibm.com>
Reviewed-by: Farhan Ali <alifm@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
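The ordering issue can be modelled in a few lines of user-space C. This is a sketch under stated assumptions: the state names, the struct, and `notify_listener_sketch()` (a stand-in for signalling the eventfd) are hypothetical, and the "listener" is reduced to recording which state it would observe at notification time.

```c
#include <assert.h>

/* Hypothetical state names for illustration. */
enum fsm_state_sketch { STATE_IDLE_SKETCH, STATE_BUSY_SKETCH };

struct fsm_private_sketch {
	enum fsm_state_sketch state;
	/* State the notified party would see when it wakes up. */
	enum fsm_state_sketch seen_by_listener;
};

/* Stand-in for signalling the eventfd: the listener samples the state. */
static void notify_listener_sketch(struct fsm_private_sketch *p)
{
	p->seen_by_listener = p->state;
}

/* Ordering after the fix: publish the idle state, then notify. */
static void io_done_fixed_sketch(struct fsm_private_sketch *p)
{
	p->state = STATE_IDLE_SKETCH;
	notify_listener_sketch(p);
}

/* Ordering before the fix, shown for contrast: notify, then go idle. */
static void io_done_old_sketch(struct fsm_private_sketch *p)
{
	notify_listener_sketch(p);
	p->state = STATE_IDLE_SKETCH;
}
```

With the old ordering the listener can observe the busy state even though the I/O has finished, which corresponds to the spurious cc=2 described above.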
2019-04-24  vfio-ccw: Prevent quiesce function going into an infinite loop  Farhan Ali  1  -14/+18
The quiesce function calls cio_cancel_halt_clear() and if we get an -EBUSY we go into a loop where we:
- wait for any interrupts
- flush all I/O in the workqueue
- retry cio_cancel_halt_clear

During the period where we are waiting for interrupts or flushing all I/O, the channel subsystem could have completed a halt/clear action and turned off the corresponding activity control bits in the subchannel status word. This means the next time we call cio_cancel_halt_clear(), we will again start by calling cancel subchannel and so we can be stuck between calling cancel and halt forever.

Rather than calling cio_cancel_halt_clear() immediately after waiting, let's try to disable the subchannel. If we succeed in disabling the subchannel then we know nothing else can happen with the device.

Suggested-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
Message-Id: <4d5a4b98ab1b41ac6131b5c36de18b76c5d66898.1555449329.git.alifm@linux.ibm.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
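The revised loop shape, where the exit condition is the result of disabling the subchannel rather than of another cancel/halt/clear attempt, can be sketched as follows. This is a user-space model under assumptions: `struct sch_sketch` and the `_sketch` functions are hypothetical stand-ins for the subchannel and for cio_disable_subchannel() and cio_cancel_halt_clear(), the waiting and flushing steps are reduced to a comment, and error cases other than -EBUSY are omitted.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical model of a subchannel, for illustration only. */
struct sch_sketch {
	int busy_rounds;	/* rounds for which disabling still reports busy */
	int enabled;
	int cancel_calls;	/* how often cancel/halt/clear was attempted */
};

/* Stand-in for disabling the subchannel: -EBUSY while work is pending. */
static int disable_sketch(struct sch_sketch *sch)
{
	if (sch->busy_rounds > 0)
		return -EBUSY;
	sch->enabled = 0;
	return 0;
}

/* Stand-in for cancel/halt/clear; each call makes progress in this model. */
static int cancel_halt_clear_sketch(struct sch_sketch *sch)
{
	sch->cancel_calls++;
	if (sch->busy_rounds > 0)
		sch->busy_rounds--;
	return sch->busy_rounds > 0 ? -EBUSY : 0;
}

static int quiesce_sketch(struct sch_sketch *sch)
{
	int ret = disable_sketch(sch);

	while (ret == -EBUSY) {
		cancel_halt_clear_sketch(sch);
		/* The driver would wait for an interrupt and flush work here. */
		ret = disable_sketch(sch);	/* decides whether to loop again */
	}
	return ret;
}
```

Once disabling succeeds the loop ends regardless of what the last cancel/halt/clear attempt returned, which is the property the commit message relies on.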
2019-04-24  vfio-ccw: Do not call flush_workqueue while holding the spinlock  Farhan Ali  1  -1/+1
Currently we call flush_workqueue while holding the subchannel spinlock. But flush_workqueue function can go to sleep, so do not call the function while holding the spinlock.

Fixes the following bug:

[ 285.203430] BUG: scheduling while atomic: bash/14193/0x00000002
[ 285.203434] INFO: lockdep is turned off.
....
[ 285.203485] Preemption disabled at:
[ 285.203488] [<000003ff80243e5c>] vfio_ccw_sch_quiesce+0xbc/0x120 [vfio_ccw]
[ 285.203496] CPU: 7 PID: 14193 Comm: bash Tainted: G W
....
[ 285.203504] Call Trace:
[ 285.203510] ([<0000000000113772>] show_stack+0x82/0xd0)
[ 285.203514] [<0000000000b7a102>] dump_stack+0x92/0xd0
[ 285.203518] [<000000000017b8be>] __schedule_bug+0xde/0xf8
[ 285.203524] [<0000000000b95b5a>] __schedule+0x7a/0xc38
[ 285.203528] [<0000000000b9678a>] schedule+0x72/0xb0
[ 285.203533] [<0000000000b9bfbc>] schedule_timeout+0x34/0x528
[ 285.203538] [<0000000000b97608>] wait_for_common+0x118/0x1b0
[ 285.203544] [<0000000000166d6a>] flush_workqueue+0x182/0x548
[ 285.203550] [<000003ff80243e6e>] vfio_ccw_sch_quiesce+0xce/0x120 [vfio_ccw]
[ 285.203556] [<000003ff80245278>] vfio_ccw_mdev_reset+0x38/0x70 [vfio_ccw]
[ 285.203562] [<000003ff802458b0>] vfio_ccw_mdev_remove+0x40/0x78 [vfio_ccw]
[ 285.203567] [<000003ff801a499c>] mdev_device_remove_ops+0x3c/0x80 [mdev]
[ 285.203573] [<000003ff801a4d5c>] mdev_device_remove+0xc4/0x130 [mdev]
[ 285.203578] [<000003ff801a5074>] remove_store+0x6c/0xa8 [mdev]
[ 285.203582] [<000000000046f494>] kernfs_fop_write+0x14c/0x1f8
[ 285.203588] [<00000000003c1530>] __vfs_write+0x38/0x1a8
[ 285.203593] [<00000000003c187c>] vfs_write+0xb4/0x198
[ 285.203597] [<00000000003c1af2>] ksys_write+0x5a/0xb0
[ 285.203601] [<0000000000b9e270>] system_call+0xdc/0x2d8

Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Message-Id: <626bab8bb2958ae132452e1ddaf1b20882ad5a9d.1554756534.git.alifm@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
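The rule behind this fix, releasing a spinlock before any call that may sleep and retaking it afterwards, can be illustrated with a small user-space model. Everything here is hypothetical: `struct lock_sketch` only tracks whether the lock is held, and `may_sleep_sketch()` stands in for a sleeping call such as flush_workqueue(), reporting an error if it is entered with the lock held.

```c
#include <assert.h>

/* Hypothetical lock model: it only records whether the lock is held. */
struct lock_sketch {
	int held;
};

static void sketch_lock(struct lock_sketch *l)
{
	assert(!l->held);
	l->held = 1;
}

static void sketch_unlock(struct lock_sketch *l)
{
	assert(l->held);
	l->held = 0;
}

/* Stand-in for a call that may sleep; fails if called with the lock held. */
static int may_sleep_sketch(const struct lock_sketch *l)
{
	return l->held ? -1 : 0;
}

static int quiesce_step_sketch(struct lock_sketch *l)
{
	int ret;

	sketch_lock(l);
	/* ... work that needs the lock ... */
	sketch_unlock(l);		/* drop the lock before sleeping */

	ret = may_sleep_sketch(l);

	sketch_lock(l);			/* retake it for the remaining work */
	/* ... more work that needs the lock ... */
	sketch_unlock(l);
	return ret;
}
```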
2019-04-24  vfio-ccw: add handling for async channel instructions  Cornelia Huck  1  -13/+33
Add a region to the vfio-ccw device that can be used to submit asynchronous I/O instructions. ssch continues to be handled by the existing I/O region; the new region handles hsch and csch.

Interrupt status continues to be reported through the same channels as for ssch.

Acked-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Farhan Ali <alifm@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2019-04-24  vfio-ccw: protect the I/O region  Cornelia Huck  1  -0/+3
Introduce a mutex to disallow concurrent reads or writes to the I/O region. This makes sure that the data the kernel or user space see is always consistent.

The same mutex will be used to protect the async region as well.

Reviewed-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2019-03-11  vfio: ccw: only free cp on final interrupt  Cornelia Huck  1  -2/+6
When we get an interrupt for a channel program, it is not necessarily the final interrupt; for example, the issuing guest may request an intermediate interrupt by specifying the program-controlled-interrupt flag on a ccw.

We must not switch the state to idle if the interrupt is not yet final; even more importantly, we must not free the translated channel program if the interrupt is not yet final, or the host can crash during cp rewind.

Fixes: e5f84dbaea59 ("vfio: ccw: return I/O results asynchronously")
Cc: stable@vger.kernel.org # v4.12+
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
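The guard described here can be sketched in user-space C. The commit message does not say how finality is detected, so this example assumes an interrupt is final when neither the device nor the subchannel is still marked active; the bit values, struct names, and `_sketch` helpers are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical activity bits; real values come from the hardware status. */
#define ACTL_DEVICE_ACTIVE_SKETCH	0x1u
#define ACTL_SUBCHANNEL_ACTIVE_SKETCH	0x2u

enum irq_state_sketch { ST_BUSY_SKETCH, ST_IDLE_SKETCH };

struct cp_sketch {
	bool allocated;		/* translated channel program still in use */
};

struct irq_private_sketch {
	struct cp_sketch cp;
	enum irq_state_sketch state;
};

/* Assumption: final means no device or subchannel activity remains. */
static bool interrupt_is_final_sketch(unsigned int actl)
{
	return !(actl & (ACTL_DEVICE_ACTIVE_SKETCH |
			 ACTL_SUBCHANNEL_ACTIVE_SKETCH));
}

static void handle_interrupt_sketch(struct irq_private_sketch *p,
				    unsigned int actl)
{
	if (!interrupt_is_final_sketch(actl))
		return;		/* intermediate interrupt: keep cp and state */

	p->cp.allocated = false;	/* stand-in for freeing the program */
	p->state = ST_IDLE_SKETCH;
}
```

An intermediate interrupt leaves both the channel program and the busy state untouched, so a later interrupt for the same program still finds valid data.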
2018-11-13  vfio: ccw: Register mediated device once all structures are initialized  Pierre Morel  1  -4/+4
Let's register the mediated device when all the data structures which could be used are initialized.

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Message-Id: <1540487720-11634-3-git-send-email-pmorel@linux.ibm.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2018-11-13  s390/cio: make vfio_ccw_io_region static  Sebastian Ott  1  -1/+1
Fix the following sparse warning:

drivers/s390/cio/vfio_ccw_drv.c:25:19: warning: symbol 'vfio_ccw_io_region' was not declared. Should it be static?

Signed-off-by: Sebastian Ott <sebott@linux.ibm.com>
Message-Id: <alpine.LFD.2.21.1810151328570.1636@schleppi.aag-de.ibmmobiledemo.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2018-09-27  s390/cio: Refactor alloc of ccw_io_region  Eric Farman  1  -4/+16
If I attach a vfio-ccw device to my guest, I get the following warning on the host when the host kernel is CONFIG_HARDENED_USERCOPY=y

[250757.595325] Bad or missing usercopy whitelist? Kernel memory overwrite attempt detected to SLUB object 'dma-kmalloc-512' (offset 64, size 124)!
[250757.595365] WARNING: CPU: 2 PID: 10958 at mm/usercopy.c:81 usercopy_warn+0xac/0xd8
[250757.595369] Modules linked in: kvm vhost_net vhost tap xt_CHECKSUM iptable_mangle ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c devlink tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables sunrpc dm_multipath s390_trng crc32_vx_s390 ghash_s390 prng aes_s390 des_s390 des_generic sha512_s390 sha1_s390 eadm_sch tape_3590 tape tape_class qeth_l2 qeth ccwgroup vfio_ccw vfio_mdev zcrypt_cex4 mdev vfio_iommu_type1 zcrypt vfio sha256_s390 sha_common zfcp scsi_transport_fc qdio dasd_eckd_mod dasd_mod
[250757.595424] CPU: 2 PID: 10958 Comm: CPU 2/KVM Not tainted 4.18.0-derp #2
[250757.595426] Hardware name: IBM 3906 M05 780 (LPAR)
...snip regs...
[250757.595523] Call Trace:
[250757.595529] ([<0000000000349210>] usercopy_warn+0xa8/0xd8)
[250757.595535] [<000000000032daaa>] __check_heap_object+0xfa/0x160
[250757.595540] [<0000000000349396>] __check_object_size+0x156/0x1d0
[250757.595547] [<000003ff80332d04>] vfio_ccw_mdev_write+0x74/0x148 [vfio_ccw]
[250757.595552] [<000000000034ed12>] __vfs_write+0x3a/0x188
[250757.595556] [<000000000034f040>] vfs_write+0xa8/0x1b8
[250757.595559] [<000000000034f4e6>] ksys_pwrite64+0x86/0xc0
[250757.595568] [<00000000008959a0>] system_call+0xdc/0x2b0
[250757.595570] Last Breaking-Event-Address:
[250757.595573] [<0000000000349210>] usercopy_warn+0xa8/0xd8

While vfio_ccw_mdev_{write|read} validates that the input position/count does not run over the ccw_io_region struct, the usercopy code that does copy_{to|from}_user doesn't necessarily know this. It sees the variable length and gets worried that it's affecting a normal kmalloc'd struct, and generates the above warning.

Adjust how the ccw_io_region is alloc'd with a whitelist to remove this warning. The boundary checking will continue to do its thing.

Signed-off-by: Eric Farman <farman@linux.ibm.com>
Message-Id: <20180921204013.95804-3-farman@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
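The boundary check that the message says stays in place can be sketched in user-space C. This is an illustration only: `struct io_region_sketch`, its size, and `region_write_sketch()` are hypothetical, and `memcpy()` stands in for the copy from user space. The check is written so that `pos + count` cannot overflow.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical region layout; the size is chosen only for illustration. */
struct io_region_sketch {
	unsigned char data[256];
};

/*
 * Copy count bytes into the region at offset pos, rejecting any request
 * that would run past the end of the region. Returns the number of bytes
 * written or -EINVAL.
 */
static long region_write_sketch(struct io_region_sketch *r, const void *buf,
				size_t count, size_t pos)
{
	if (pos > sizeof(*r) || count > sizeof(*r) - pos)
		return -EINVAL;

	memcpy((unsigned char *)r + pos, buf, count);
	return (long)count;
}
```

The hardened usercopy check in the kernel is a second, independent layer: it validates the copy against the slab object's whitelisted window and has no knowledge of a driver-level check like this one, which is why the allocation itself had to change.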
2018-09-27  s390/cio: Convert ccw_io_region to pointer  Eric Farman  1  -1/+11
In the event that we want to change the layout of the ccw_io_region in the future[1], it might be easier to work with it as a pointer within the vfio_ccw_private struct rather than an embedded struct.

[1] https://patchwork.kernel.org/comment/22228541/

Signed-off-by: Eric Farman <farman@linux.ibm.com>
Message-Id: <20180921204013.95804-2-farman@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2018-05-29  vfio: ccw: fix error return in vfio_ccw_sch_event  Dong Jia Shi  1  -1/+4
If the device has not been registered, or there is work pending, we should reschedule a sch_event call again.

Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Message-Id: <20180502072559.50691-1-bjsdjshi@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2017-11-24  s390: cio: add SPDX identifiers to the remaining files  Greg Kroah-Hartman  1  -0/+1
It's good to have SPDX identifiers in all files to make it easier to audit the kernel tree for correct licenses.

Update the drivers/s390/cio/ files with the correct SPDX license identifier based on the license text in the file itself. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text.

This work is based on a script and data from Thomas Gleixner, Philippe Ombredanne, and Kate Stewart.

Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
Cc: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>