summaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2020-07-26powerpc/powernv/sriov: De-indent setup and teardownOliver O'Halloran2-41/+49
Remove the IODA2 PHB checks. We already assume IODA2 in several places so there's not much point in wrapping most of the setup and teardown process in an if block. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200722065715.1432738-12-oohall@gmail.com
2020-07-26powerpc/powernv/sriov: Drop iov->pe_num_map[]Oliver O'Halloran2-88/+28
Currently the iov->pe_num_map[] does one of two things depending on whether single PE mode is being used or not. When it is, this contains an array which maps a vf_index to the corresponding PE number. When single PE mode is not being used this contains a scalar which is the base PE for the set of enabled VFs (for for VFn is base + n). The array was necessary because when calling pnv_ioda_alloc_pe() there is no guarantee that the allocated PEs would be contigious. We can now allocate contigious blocks of PEs so this is no longer an issue. This allows us to drop the if (single_mode) {} .. else {} block scattered through the SR-IOV code which is a nice clean up. This also fixes a bug in pnv_pci_sriov_disable() which is the non-atomic bitmap_clear() to manipulate the PE allocation map. Other users of the map assume it will be accessed with atomic ops. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200722065715.1432738-11-oohall@gmail.com
2020-07-26powerpc/powernv/pci: Refactor pnv_ioda_alloc_pe()Oliver O'Halloran2-9/+34
Rework the PE allocation logic to allow allocating blocks of PEs rather than individually. We'll use this to allocate contigious blocks of PEs for the SR-IOVs. This patch also adds code to pnv_ioda_alloc_pe() and pnv_ioda_reserve_pe() to use the existing, but unused, phb->pe_alloc_mutex. Currently these functions use atomic bit ops to release a currently allocated PE number. However, the pnv_ioda_alloc_pe() wants to have exclusive access to the bit map while scanning for hole large enough to accomodate the allocation size. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200722065715.1432738-10-oohall@gmail.com
2020-07-26powerpc/powernv/sriov: Factor out M64 BAR setupOliver O'Halloran1-29/+100
The sequence required to use the single PE BAR mode is kinda janky and requires a little explanation. The API was designed with P7-IOC style windows where the setup process is something like: 1. Configure the window start / end address 2. Enable the window 3. Map the segments of each window to the PE For Single PE BARs the process is: 1. Set the PE for segment zero on a disabled window 2. Set the range 3. Enable the window Move the OPAL calls into their own helper functions where the quirks can be contained. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200722065715.1432738-9-oohall@gmail.com
2020-07-26powerpc/powernv/sriov: Simplify used window trackingOliver O'Halloran2-36/+22
No need for the multi-dimensional arrays, just use a bitmap. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200722065715.1432738-8-oohall@gmail.com
2020-07-26powerpc/powernv/sriov: Rename truncate_iovOliver O'Halloran1-5/+6
This prevents SR-IOV being used by making the SR-IOV BAR resources unallocatable. Rename it to reflect what it actually does. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200722065715.1432738-7-oohall@gmail.com
2020-07-26powerpc/powernv/sriov: Explain how SR-IOV works on PowerNVOliver O'Halloran1-0/+130
SR-IOV support on PowerNV is a byzantine maze of hooks. I have no idea how anyone is supposed to know how it works except through a lot of stuffering. Write up some docs about the overall story to help out the next sucker^Wperson who needs to tinker with it. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200722065715.1432738-6-oohall@gmail.com
2020-07-26powerpc/powernv/sriov: Move SR-IOV into a separate fileOliver O'Halloran5-655/+736
pci-ioda.c is getting a bit unwieldly due to the amount of stuff jammed in there. The SR-IOV support can be extracted easily enough and is mostly standalone, so move it into a separate file. This patch also moves the PowerNV SR-IOV specific fields from pci_dn and moves them into a platform specific structure. I'm not sure how they ended up in there in the first place, but leaking platform specifics into common code has proven to be a terrible idea so far so lets stop doing that. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200722065715.1432738-5-oohall@gmail.com
2020-07-26powerpc/powernv/pci: Initialise M64 for IODA1 as a 1-1 windowOliver O'Halloran1-30/+25
We pre-configure the m64 window for IODA1 as a 1-1 segment-PE mapping, similar to PHB3. Currently the actual mapping of segments occurs in pnv_ioda_pick_m64_pe(), but we can move it into pnv_ioda1_init_m64() and drop the IODA1 specific code paths in the PE setup / teardown. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200722065715.1432738-4-oohall@gmail.com
2020-07-26powerpc/powernv/pci: Add explicit tracking of the DMA setup stateOliver O'Halloran2-19/+36
There's an optimisation in the PE setup which skips performing DMA setup for a PE if we only have bridges in a PE. The assumption being that only "real" devices will DMA to system memory, which is probably fair. However, if we start off with only bridge devices in a PE then add a non-bridge device the new device won't be able to use DMA because we never configured it. Fix this (admittedly pretty weird) edge case by tracking whether we've done the DMA setup for the PE or not. If a non-bridge device is added to the PE (via rescan or hotplug, or whatever) we can set up DMA on demand. This also means the only remaining user of the old "DMA Weight" code is the IODA1 DMA setup code that it was originally added for, which is good. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200722065715.1432738-3-oohall@gmail.com
2020-07-26powerpc/powernv/pci: Always tear down DMA windows on PE releaseOliver O'Halloran1-27/+3
Currently we have these two functions: pnv_pci_ioda2_release_dma_pe(), and pnv_pci_ioda2_release_pe_dma() The first is used when tearing down VF PEs and the other is used for normal devices. There's very little difference between the two though. The latter (non-VF) will skip a call to pnv_pci_ioda2_unset_window() unless CONFIG_IOMMU_API=y is set. There's no real point in doing this so fold the two together. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200722065715.1432738-2-oohall@gmail.com
2020-07-26powerpc/powernv/pci: Add pci_bus_to_pnvhb() helperOliver O'Halloran3-74/+38
Add a helper to go from a pci_bus structure to the pnv_phb that hosts that bus. There's a lot of instances of the following pattern: struct pci_controller *hose = pci_bus_to_host(pdev->bus); struct pnv_phb *phb = hose->private_data; Without any other uses of the pci_controller inside the function. This is hard to read since it requires you to memorise the contents of the private data fields and kind of error prone since it involves blindly assigning a void pointer. Add a helper to make it more concise and explicit. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200722065715.1432738-1-oohall@gmail.com
2020-07-26powerpc/eeh: Move PE tree setup into the platformOliver O'Halloran4-64/+103
The EEH core has a concept of a "PE tree" to support PowerNV. The PE tree follows the PCI bus structures because a reset asserted on an upstream bridge will be propagated to the downstream bridges. On pseries there's a 1-1 correspondence between what the guest sees are a PHB and a PE so the "tree" is really just a single node. Current the EEH core is reponsible for setting up this PE tree which it does by traversing the pci_dn tree. The structure of the pci_dn tree matches the bus tree on PowerNV which leads to the PE tree being "correct" this setup method doesn't make a whole lot of sense and it's actively confusing for the pseries case where it doesn't really do anything. We want to remove the dependence on pci_dn anyway so this patch move choosing where to insert a new PE into the platform code rather than being part of the generic EEH code. For PowerNV this simplifies the tree building logic and removes the use of pci_dn. For pseries we keep the existing logic. I'm not really convinced it does anything due to the 1-1 PE-to-PHB correspondence so every device under that PHB should be in the same PE, but I'd rather not remove it entirely until we've had a chance to look at it more deeply. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200725081231.39076-14-oohall@gmail.com
2020-07-26powerpc/eeh: Drop pdn use in eeh_pe_tree_insert()Oliver O'Halloran1-8/+7
This is mostly just to make the subsequent diffs less noisy. No functional changes. One thing that needs calling out is the removal of the "config_addr" variable and replacing it with edev->bdfn. The contents of edev->bdfn are the same, however it's worth pointing out that what RTAS calls a "config_addr" isn't the same as the bdfn. The config_addr is supposed to be: <bus><devfn><reg> with each field being an 8 bit number. Various parts of the EEH code use BDFN and "config_addr" as interchangeable quantities even though they aren't really. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200725081231.39076-13-oohall@gmail.com
2020-07-26powerpc/eeh: Rename eeh_{add_to|remove_from}_parent_pe()Oliver O'Halloran7-15/+15
The naming of eeh_{add_to|remove_from}_parent_pe() doesn't really reflect what they actually do. If the PE referred to be edev->pe_config_addr already exists under that PHB then the edev is added to that PE. However, if the PE doesn't exist the a new one is created for the edev. The bulk of the implementation of eeh_add_to_parent_pe() covers that second case. Similarly, most of eeh_remove_from_parent_pe() is determining when it's safe to delete a PE. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200725081231.39076-12-oohall@gmail.com
2020-07-26powerpc/eeh: Remove class code field from edevOliver O'Halloran3-6/+3
The edev->class_code field is never referenced anywhere except for the platform specific probe functions. The same information is available in the pci_dev for PowerNV and in the pci_dn on pseries so we can remove the field. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200725081231.39076-11-oohall@gmail.com
2020-07-26powerpc/eeh: Remove spurious use of pci_dn in eeh_dump_dev_logOliver O'Halloran1-10/+4
Retrieve the domain, bus, device, and function numbers from the edev. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200725081231.39076-10-oohall@gmail.com
2020-07-26powerpc/eeh: Pass eeh_dev to eeh_ops->{read|write}_config()Oliver O'Halloran5-64/+68
Mechanical conversion of the eeh_ops interfaces to use eeh_dev to reference a specific device rather than pci_dn. No functional changes. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200725081231.39076-9-oohall@gmail.com
2020-07-26powerpc/eeh: Pass eeh_dev to eeh_ops->resume_notify()Oliver O'Halloran4-7/+5
Mechanical conversion of the eeh_ops interfaces to use eeh_dev to reference a specific device rather than pci_dn. No functional changes. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200725081231.39076-8-oohall@gmail.com
2020-07-26powerpc/eeh: Pass eeh_dev to eeh_ops->restore_config()Oliver O'Halloran4-12/+7
Mechanical conversion of the eeh_ops interfaces to use eeh_dev to reference a specific device rather than pci_dn. No functional changes. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200725081231.39076-7-oohall@gmail.com
2020-07-26powerpc/eeh: Remove VF config space restorationOliver O'Halloran4-99/+7
There's a bunch of strange things about this code. First up is that none of the fields being written to are functional for a VF. The SR-IOV specification lists then as "Reserved, but OS should preserve" so writing new values to them doesn't do anything and is clearly wrong from a correctness perspective. However, since VFs are designed to be managed by the OS there is an argument to be made that we should be saving and restoring some parts of config space. We already sort of do that by saving the first 64 bytes of config space in the eeh_dev (see eeh_dev->config_space[]). This is inadequate since it doesn't even consider saving and restoring the PCI capability structures. However, this is a problem with EEH in general and that needs to be fixed for non-VF devices too. There's no real reason to keep around this around so delete it. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200725081231.39076-6-oohall@gmail.com
2020-07-26powerpc/eeh: Kill off eeh_ops->get_pe_addr()Oliver O'Halloran3-25/+11
This is used in precisely one place which is in pseries specific platform code. There's no need to have the callback in eeh_ops since the platform chooses the EEH PE addresses anyway. The PowerNV implementation has always been a stub too so remove it. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200725081231.39076-5-oohall@gmail.com
2020-07-26powerpc/pseries: Stop using pdn->pe_numberOliver O'Halloran1-6/+4
The pci_dn->pe_number field is mainly used to track the IODA PE number of a device on PowerNV. At some point it grew a user in the pseries SR-IOV support which muddies the waters a bit, so remove it. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200725081231.39076-4-oohall@gmail.com
2020-07-26powerpc/eeh: Move vf_index out of pci_dn and into eeh_devOliver O'Halloran4-8/+9
Drivers that do not support the PCI error handling callbacks are handled by tearing down the device and re-probing them. If the device being removed is a virtual function then we need to know the VF index so it can be removed using the pci_iov_{add|remove}_virtfn() API. Currently this is handled by looking up the pci_dn, and using the vf_index that was stashed there when the pci_dn for the VF was created in pcibios_sriov_enable(). We would like to eliminate the use of pci_dn outside of pseries though so we need to provide the generic EEH code with some other way to find the vf_index. The easiest thing to do here is move the vf_index field out of pci_dn and into eeh_dev. Currently pci_dn and eeh_dev are allocated and initialized together so this is a fairly minimal change in preparation for splitting pci_dn and eeh_dev in the future. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200725081231.39076-3-oohall@gmail.com
2020-07-26powerpc/eeh: Remove eeh_dev.cOliver O'Halloran4-61/+21
The only thing in this file is eeh_dev_init() which is allocates and initialises an eeh_dev based on a pci_dn. This is only ever called from pci_dn.c so move it into there and remove the file. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200725081231.39076-2-oohall@gmail.com
2020-07-26powerpc/eeh: Remove eeh_dev_phb_init_dynamic()Oliver O'Halloran5-18/+5
This function is a one line wrapper around eeh_phb_pe_create() and despite the name it doesn't create any eeh_dev structures. Replace it with direct calls to eeh_phb_pe_create() since that does what it says on the tin and removes a layer of indirection. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200725081231.39076-1-oohall@gmail.com
2020-07-26powerpc/watchpoint: Remove 512 byte boundaryRavi Bangoria1-2/+3
Power10 has removed 512 bytes boundary from match criteria i.e. the watch range can cross 512 bytes boundary. Note: ISA 3.1 Book III 9.4 match criteria includes 512 byte limit but that is a documentation mistake and hopefully will be fixed in the next version of ISA. Though, ISA 3.1 change log mentions about removal of 512B boundary: Multiple DEAW: Added a second Data Address Watchpoint. [H]DAR is set to the first byte of overlap. 512B boundary is removed. Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200723090813.303838-11-ravi.bangoria@linux.ibm.com
2020-07-26powerpc/watchpoint: Return available watchpoints dynamicallyRavi Bangoria2-3/+6
So far Book3S Powerpc supported only one watchpoint. Power10 is introducing 2nd DAWR. Enable 2nd DAWR support for Power10. Availability of 2nd DAWR will depend on CPU_FTR_DAWR1. Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200723090813.303838-10-ravi.bangoria@linux.ibm.com
2020-07-26powerpc/watchpoint: Guest support for 2nd DAWR hcallRavi Bangoria5-4/+13
2nd DAWR can be set/unset using H_SET_MODE hcall with resource value 5. Enable powervm guest support with that. This has no effect on kvm guest because kvm will return error if guest does hcall with resource value 5. Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200723090813.303838-9-ravi.bangoria@linux.ibm.com
2020-07-26powerpc/watchpoint: Rename current H_SET_MODE DAWR macroRavi Bangoria3-3/+3
Current H_SET_MODE hcall macro name for setting/resetting DAWR0 is H_SET_MODE_RESOURCE_SET_DAWR. Add suffix 0 to macro name as well. Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Reviewed-by: Jordan Niethe <jniethe5@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200723090813.303838-8-ravi.bangoria@linux.ibm.com
2020-07-26powerpc/watchpoint: Set CPU_FTR_DAWR1 based on pa-features bitRavi Bangoria1-0/+2
As per the PAPR, bit 0 of byte 64 in pa-features property indicates availability of 2nd DAWR registers. i.e. If this bit is set, 2nd DAWR is present, otherwise not. Host generally uses "cpu-features", which masks "pa-features". But "cpu-features" are still not used for guests and thus this change is mostly applicable for guests only. Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Tested-by: Jordan Niethe <jniethe5@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200723090813.303838-7-ravi.bangoria@linux.ibm.com
2020-07-26powerpc/dt_cpu_ftrs: Add feature for 2nd DAWRRavi Bangoria2-1/+3
Add new device-tree feature for 2nd DAWR. If this feature is present, 2nd DAWR is supported, otherwise not. Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200723090813.303838-6-ravi.bangoria@linux.ibm.com
2020-07-26powerpc/watchpoint: Enable watchpoint functionality on power10 guestRavi Bangoria1-1/+2
CPU_FTR_DAWR is by default enabled for host via CPU_FTRS_DT_CPU_BASE (controlled by CONFIG_PPC_DT_CPU_FTRS). But cpu-features device-tree node is not PAPR compatible and thus not yet used by kvm or pHyp guests. Enable watchpoint functionality on power10 guest (both kvm and powervm) by adding CPU_FTR_DAWR to CPU_FTRS_POWER10. Note that this change does not enable 2nd DAWR support. Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Tested-by: Jordan Niethe <jniethe5@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200723090813.303838-5-ravi.bangoria@linux.ibm.com
2020-07-26powerpc/watchpoint: Fix DAWR exception for CACHEOPRavi Bangoria1-1/+20
'ea' returned by analyse_instr() needs to be aligned down to cache block size for CACHEOP instructions. analyse_instr() does not set size for CACHEOP, thus size also needs to be calculated manually. Fixes: 27985b2a640e ("powerpc/watchpoint: Don't ignore extraneous exceptions blindly") Fixes: 74c6881019b7 ("powerpc/watchpoint: Prepare handler to handle more than one watchpoint") Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200723090813.303838-4-ravi.bangoria@linux.ibm.com
2020-07-26powerpc/watchpoint: Fix DAWR exception constraintRavi Bangoria1-31/+41
Pedro Miraglia Franco de Carvalho noticed that on p8/p9, DAR value is inconsistent with different type of load/store. Like for byte,word etc. load/stores, DAR is set to the address of the first byte of overlap between watch range and real access. But for quadword load/ store it's sometime set to the address of the first byte of real access whereas sometime set to the address of the first byte of overlap. This issue has been fixed in p10. In p10(ISA 3.1), DAR is always set to the address of the first byte of overlap. Commit 27985b2a640e ("powerpc/watchpoint: Don't ignore extraneous exceptions blindly") wrongly assumes that DAR is set to the address of the first byte of overlap for all load/stores on p8/p9 as well. Fix that. With the fix, we now rely on 'ea' provided by analyse_instr(). If analyse_instr() fails, generate event unconditionally on p8/p9, and on p10 generate event only if DAR is within a DAWR range. Note: 8xx is not affected. Fixes: 27985b2a640e ("powerpc/watchpoint: Don't ignore extraneous exceptions blindly") Fixes: 74c6881019b7 ("powerpc/watchpoint: Prepare handler to handle more than one watchpoint") Reported-by: Pedro Miraglia Franco de Carvalho <pedromfc@br.ibm.com> Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200723090813.303838-3-ravi.bangoria@linux.ibm.com
2020-07-26powerpc/watchpoint: Fix 512 byte boundary limitRavi Bangoria1-1/+1
Milton Miller reported that we are aligning start and end address to wrong size SZ_512M. It should be SZ_512. Fix that. While doing this change I also found a case where ALIGN() comparison fails. Within a given aligned range, ALIGN() of two addresses does not match when start address is pointing to the first byte and end address is pointing to any other byte except the first one. But that's not true for ALIGN_DOWN(). ALIGN_DOWN() of any two addresses within that range will always point to the first byte. So use ALIGN_DOWN() instead of ALIGN(). Fixes: e68ef121c1f4 ("powerpc/watchpoint: Use builtin ALIGN*() macros") Reported-by: Milton Miller <miltonm@us.ibm.com> Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Tested-by: Jordan Niethe <jniethe5@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200723090813.303838-2-ravi.bangoria@linux.ibm.com
2020-07-26powerpc/book3s64/pkey: Disable pkey on POWER6 and beforeAneesh Kumar K.V1-0/+6
POWER6 only supports AMR update via privileged mode (MSR[PR] = 0, SPRN_AMR=29) The PR=1 (userspace) alias for that SPR (SPRN_AMR=13) was only supported from POWER7. Since we don't allow userspace modifying of AMR value we should disable pkey support on P6 and before. The hypervisor will still report pkey support via "ibm,processor-storage-keys". Hence also check for P7 CPU_FTR bit to decide on pkey support. Fixes: f491fe3fb41e ("powerpc/book3s64/pkeys: Simplify the key initialization") Reported-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200726132517.399076-1-aneesh.kumar@linux.ibm.com
2020-07-24powerpc/sstep: Fix incorrect CONFIG symbol in scv handlingMichael Ellerman1-1/+1
When I "fixed" the ppc64e build in Nick's recent patch, I typoed the CONFIG symbol, resulting in one that doesn't exist. Fix it to use the correct symbol. Reported-by: Christophe Leroy <christophe.leroy@csgroup.eu> Fixes: 7fa95f9adaee ("powerpc/64s: system call support for scv/rfscv instructions") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200724131609.1640533-1-mpe@ellerman.id.au
2020-07-24powerpc/test_emulate_sstep: Fix build errorMichael Ellerman1-0/+1
ppc64_book3e_allmodconfig fails with: arch/powerpc/lib/test_emulate_step.c: In function 'test_pld': arch/powerpc/lib/test_emulate_step.c:113:7: error: implicit declaration of function 'cpu_has_feature' 113 | if (!cpu_has_feature(CPU_FTR_ARCH_31)) { | ^~~~~~~~~~~~~~~ Add an include of cpu_has_feature.h to fix it. Fixes: b6b54b42722a ("powerpc/sstep: Add tests for prefixed integer load/stores") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200724004109.1461709-1-mpe@ellerman.id.au
2020-07-23Merge branch 'scv' support into nextMichael Ellerman20-47/+445
From Nick's cover letter: Linux powerpc new system call instruction and ABI System Call Vectored (scv) ABI ============================== The scv instruction is introduced with POWER9 / ISA3, it comes with an rfscv counter-part. The benefit of these instructions is performance (trading slower SRR0/1 with faster LR/CTR registers, and entering the kernel with MSR[EE] and MSR[RI] left enabled, which can reduce MSR updates. The scv instruction has 128 levels (not enough to cover the Linux system call space). Assignment and advertisement ---------------------------- The proposal is to assign scv levels conservatively, and advertise them with HWCAP feature bits as we add support for more. Linux has not enabled FSCR[SCV] yet, so executing the scv instruction will cause the kernel to log a "SCV facility unavilable" message, and deliver a SIGILL with ILL_ILLOPC to the process. Linux has defined a HWCAP2 bit PPC_FEATURE2_SCV for SCV support, but does not set it. This change allocates the zero level ('scv 0'), advertised with PPC_FEATURE2_SCV, which will be used to provide normal Linux system calls (equivalent to 'sc'). Attempting to execute scv with other levels will cause a SIGILL to be delivered the same as before, but will not log a "SCV facility unavailable" message (because the processor facility is enabled). Calling convention ------------------ The proposal is for scv 0 to provide the standard Linux system call ABI with the following differences from sc convention[1]: - LR is to be volatile across scv calls. This is necessary because the scv instruction clobbers LR. From previous discussion, this should be possible to deal with in GCC clobbers and CFI. - cr1 and cr5-cr7 are volatile. This matches the C ABI and would allow the kernel system call exit to avoid restoring the volatile cr registers (although we probably still would anyway to avoid information leaks). - Error handling: The consensus among kernel, glibc, and musl is to move to using negative return values in r3 rather than CR0[SO]=1 to indicate error, which matches most other architectures, and is closer to a function call. Notes ----- - r0,r4-r8 are documented as volatile in the ABI, but the kernel patch as submitted currently preserves them. This is to leave room for deciding which way to go with these. Some small benefit was found by preserving them[1] but I'm not convinced it's worth deviating from the C function call ABI just for this. Release code should follow the ABI. Previous discussions: https://lists.ozlabs.org/pipermail/linuxppc-dev/2020-April/208691.html https://lists.ozlabs.org/pipermail/linuxppc-dev/2020-April/209268.html [1] https://github.com/torvalds/linux/blob/master/Documentation/powerpc/syscall64-abi.rst [2] https://lists.ozlabs.org/pipermail/linuxppc-dev/2020-April/209263.html
2020-07-23selftests/powerpc: Add test of memcmp at end of pageMichael Ellerman1-18/+22
Update our memcmp selftest, to test the case where we're comparing up to the end of a page and the subsequent page is not mapped. We have to make sure we don't read off the end of the page and cause a fault. We had a bug there in the past, fixed in commit d9470757398a ("powerpc/64: Fix memcmp reading past the end of src/dest"). Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200722055315.962391-1-mpe@ellerman.id.au
2020-07-23powerpc/powernv/idle: Exclude mfspr on HID1, 4, 5 on P9 and abovePratik Rajesh Sampat1-3/+3
POWER9 onwards the support for the registers HID1, HID4, HID5 has been receded. Although mfspr on the above registers worked in Power9, In Power10 simulator is unrecognized. Moving their assignment under the check for machines lower than Power9 Signed-off-by: Pratik Rajesh Sampat <psampat@linux.ibm.com> Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200721153708.89057-4-psampat@linux.ibm.com
2020-07-23powerpc/powernv/idle: Rename pnv_first_spr_loss_level variablePratik Rajesh Sampat1-9/+9
Replace the variable name from using "pnv_first_spr_loss_level" to "deep_spr_loss_state". pnv_first_spr_loss_level is supposed to be the earliest state that has OPAL_PM_LOSE_FULL_CONTEXT set, in other places the kernel uses the "deep" states as terminology. Hence renaming the variable to be coherent to its semantics. Signed-off-by: Pratik Rajesh Sampat <psampat@linux.ibm.com> Acked-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200721153708.89057-3-psampat@linux.ibm.com
2020-07-23powerpc/powernv/idle: Replace CPU feature check with PVR checkPratik Rajesh Sampat1-1/+1
The POWER9 idle driver contains implementation-specific details that means it is not suitable to run on any processor that implements ISA v3.0 (e.g., POWER10), so only init the driver when running on a POWER9. Signed-off-by: Pratik Rajesh Sampat <psampat@linux.ibm.com> [mpe: Use updated change log from Nick] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200721153708.89057-2-psampat@linux.ibm.com
2020-07-23powerpc/mm/hash64: Remove comment that is no longer validSantosh Sivaraj1-4/+0
hash_low_64.S was removed in commit a43c0eb8364c ("powerpc/mm: Convert 4k insert from asm to C") and flush_hash_page() is no longer called from any assembly routine. Signed-off-by: Santosh Sivaraj <santosh@fossix.org> [mpe: Tweak comment wording] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200721091915.205006-1-santosh@fossix.org
2020-07-23KVM: PPC: Fix typo on H_DISABLE_AND_GET hcallLeonardo Bras3-3/+3
On PAPR+ the hcall() on 0x1B0 is called H_DISABLE_AND_GET, but got defined as H_DISABLE_AND_GETC instead. This define was introduced with a typo in commit <b13a96cfb055> ("[PATCH] powerpc: Extends HCALL interface for InfiniBand usage"), and was later used without having the typo noticed. Signed-off-by: Leonardo Bras <leobras.c@gmail.com> Acked-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200707004812.190765-1-leobras.c@gmail.com
2020-07-23powerpc/spufs: Fix the type of ret in spufs_arch_write_noteChristoph Hellwig1-1/+1
Both the ->dump method and snprintf return an int. So switch to an int and properly handle errors from ->dump. Fixes: 5456ffdee666 ("powerpc/spufs: simplify spufs core dumping") Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200610085554.5647-1-hch@lst.de
2020-07-23powerpc/powernv: Machine check handler for POWER10Nicholas Piggin4-0/+96
Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200702233343.1128026-1-npiggin@gmail.com
2020-07-23powerpc/pseries: PCIE PHB resetWen Xiong1-63/+169
Several device drivers hit EEH(Extended Error handling) when triggering kdump on Pseries PowerVM. This patch implemented a reset of the PHBs in pci general code when triggering kdump. PHB reset stop all PCI transactions from normal kernel. We have tested the patch in several enviroments: - direct slot adapters - adapters under the switch - a VF adapter in PowerVM - a VF adapter/adapter in KVM guest. Signed-off-by: Wen Xiong <wenxiong@linux.vnet.ibm.com> [mpe: Fix broken whitespace, subject & SOB formatting] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1594651173-32166-1-git-send-email-wenxiong@linux.vnet.ibm.com
2020-07-23powerpc: Select ARCH_HAS_MEMBARRIER_SYNC_CORENicholas Piggin5-3/+22
powerpc return from interrupt and return from system call sequences are context synchronising. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200716013522.338318-1-npiggin@gmail.com