|
After commit 26e495f34107 ("iommu/amd: Restore IRTE.RemapEn bit after
programming IRTE"), smatch warns:
drivers/iommu/amd/iommu.c:3870 amd_iommu_deactivate_guest_mode()
warn: variable dereferenced before check 'entry' (see line 3867)
Fix this by moving the @valid assignment to after @entry has been checked
for NULL.
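A minimal sketch of the reordering (simplified; the irte_ga field names are
approximated from the driver, not quoted from the patch):
    u64 valid;

    if (!entry || !entry->lo.fields_vapic.guest_mode)
            return 0;

    /* Safe now: @entry is known to be non-NULL. */
    valid = entry->lo.fields_remap.valid;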
Fixes: 26e495f34107 ("iommu/amd: Restore IRTE.RemapEn bit after programming IRTE")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Link: https://lore.kernel.org/r/20200910171621.12879-1-joao.m.martins@oracle.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Add support for the IOMMU on MT8167
Signed-off-by: Fabien Parent <fparent@baylibre.com>
Reviewed-by: Yong Wu <yong.wu@mediatek.com>
Link: https://lore.kernel.org/r/20200907101649.1573134-3-fparent@baylibre.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Add a new flag in order to select which IVRP_PADDR format is used
by an SoC.
Signed-off-by: Fabien Parent <fparent@baylibre.com>
Reviewed-by: Yong Wu <yong.wu@mediatek.com>
Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com>
Link: https://lore.kernel.org/r/20200907101649.1573134-2-fparent@baylibre.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
A PASID is allocated for an "mm" the first time any thread binds to an
SVA-capable device and is freed from the "mm" when the SVA is unbound
by the last thread. It's possible for the "mm" to have different PASID
values in different binding/unbinding SVA cycles.
The mm's PASID (non-zero for a valid PASID, 0 for an invalid one) is
propagated to a per-thread PASID MSR for all threads within the mm
through IPI, context switch, or inheritance. This ensures that a
running thread has the right PASID in the MSR, matching the mm's PASID.
[ bp: s/SVM/SVA/g; massage. ]
Suggested-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/1600187413-163670-10-git-send-email-fenghua.yu@intel.com
|
|
"flags" passed to intel_svm_bind_mm() is a bit mask and should be
defined as "unsigned int" instead of "int".
Change its type to "unsigned int".
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Acked-by: Joerg Roedel <jroedel@suse.de>
Link: https://lkml.kernel.org/r/1600187413-163670-3-git-send-email-fenghua.yu@intel.com
|
|
PASID is defined as a few different types in the iommu code, including
"int", "u32", and "unsigned int". To be consistent and to match the uapi
definitions, define PASID and its variations (e.g. max PASID) as "u32".
"u32" is also shorter and a little more explicit than "unsigned int".
No PASID type is changed in uapi, although it defines PASID as __u64 in
some places.
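Purely for illustration (struct and field names are hypothetical, not from
the series), PASID-carrying variables, parameters and limits become u32:
    struct example_pasid_state {
            u32 pasid;      /* was: int or unsigned int */
            u32 max_pasid;  /* was: int or unsigned int */
    };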
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Joerg Roedel <jroedel@suse.de>
Link: https://lkml.kernel.org/r/1600187413-163670-2-git-send-email-fenghua.yu@intel.com
|
|
The new field 'dma_range_map' in struct device is used to facilitate the
use of single or multiple offsets between mapping regions of cpu addrs and
dma addrs. It subsumes the role of "dev->dma_pfn_offset" which was only
capable of holding a single uniform offset and had no region bounds
checking.
The function of_dma_get_range() has been modified so that it takes a single
argument -- the device node -- and returns a map, NULL, or an error code.
The map is an array that holds the information regarding the DMA regions.
Each range entry contains the address offset, the cpu_start address, the
dma_start address, and the size of the region.
of_dma_configure() is the typical way to set range offsets, but there are
a number of ad hoc assignments to "dev->dma_pfn_offset" in the kernel
driver code. These cases now invoke the function
dma_direct_set_offset(dev, cpu_addr, dma_addr, size).
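A minimal sketch of a converted ad hoc user (device, addresses and size are
illustrative; headers and error handling omitted). dma_direct_set_offset()
builds a single-entry dma_range_map describing the CPU/DMA offset for the
region:
    static int example_probe(struct device *dev)
    {
            /* RAM at CPU address 0x80000000 is seen by the device at bus address 0x0 */
            return dma_direct_set_offset(dev, 0x80000000, 0x0, 0x40000000 /* 1 GiB */);
    }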
Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com>
[hch: various interface cleanups]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Tested-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Tested-by: Nathan Chancellor <natechancellor@gmail.com>
|
|
Now that the domain can be retrieved through device::msi_domain, the domain
search for PCI_MSI[X] is no longer required. Remove it.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20200826112334.400700807@linutronix.de
|
|
Now that the domain can be retrieved through device::msi_domain, the domain
search for PCI_MSI[X] is no longer required. Remove it.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Joerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20200826112334.305699301@linutronix.de
|
|
As the next step to make X86 utilize the direct MSI irq domain operations,
store the irq domain pointer in the device struct when a device is probed.
Only the irqdomain of devices which are handled by a regular PCI/MSI irq
domain is overridden; this protects PCI devices behind special busses like
VMD, which have their own irq domain.
No functional change.
It just avoids the redirection through arch_*_msi_irqs() and allows the
PCI/MSI core to directly invoke the irq domain alloc/free functions instead
of having to look up the irq domain for every single MSI interrupt.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Joerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20200826112333.806328762@linutronix.de
|
|
As a first step to make X86 utilize the direct MSI irq domain operations,
store the irq domain pointer in the device struct when a device is probed.
This is done from dmar_pci_bus_add_dev() because it has to work even when
DMA remapping is disabled. Only the irqdomain of devices which are handled
by a regular PCI/MSI irq domain is overridden; this protects PCI devices
behind special busses like VMD, which have their own irq domain.
No functional change. It just avoids the redirection through
arch_*_msi_irqs() and allows the PCI/MSI core to directly invoke the irq
domain alloc/free functions instead of having to look up the irq domain for
every single MSI interrupt.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Joerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20200826112333.714566121@linutronix.de
|
|
Convert the interrupt remap drivers to retrieve the pci device from the msi
descriptor and use info::hwirq.
This is the first step to prepare x86 for using the generic MSI domain ops.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Wei Liu <wei.liu@kernel.org>
Acked-by: Joerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20200826112332.466405395@linutronix.de
|
|
Move the IOAPIC specific fields into their own struct and reuse the common
devid. Get rid of the #ifdeffery as it does not matter at all whether the
alloc info is a couple of bytes longer or not.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Wei Liu <wei.liu@kernel.org>
Acked-by: Joerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20200826112332.054367732@linutronix.de
|
|
None of the magic HPET fields are required in any way.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Joerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20200826112331.943993771@linutronix.de
|
|
Now that the iommu implementations handle the X86_*_GET_PARENT_DOMAIN
types, consolidate the two getter functions.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Joerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20200826112331.741909337@linutronix.de
|
|
The irq domain request mode is now indicated in irq_alloc_info::type.
Consolidate the two getter functions into one.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Joerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20200826112331.634777249@linutronix.de
|
|
The irq domain request mode is now indicated in irq_alloc_info::type.
Consolidate the two getter functions into one.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Joerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20200826112331.530546013@linutronix.de
|
|
irq_remapping_ir_irq_domain() is used to retrieve the remapping parent
domain for an allocation type. irq_remapping_irq_domain() is for retrieving
the actual device domain for allocating interrupts for a device.
The two functions are similar and can be unified by using explicit modes
for parent irq domain retrieval.
Add X86_IRQ_ALLOC_TYPE_IOAPIC/HPET_GET_PARENT and use it in the iommu
implementations. Drop the parent domain retrieval for PCI_MSI/X as that is
unused.
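A rough sketch of how a remapping driver can serve the new modes (function
and field names are approximated for illustration, not taken from the patch):
    static struct irq_domain *get_parent_domain(struct amd_iommu *iommu,
                                                struct irq_alloc_info *info)
    {
            switch (info->type) {
            case X86_IRQ_ALLOC_TYPE_IOAPIC_GET_PARENT:
            case X86_IRQ_ALLOC_TYPE_HPET_GET_PARENT:
                    return iommu->ir_domain;
            default:
                    return NULL;
            }
    }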
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Joerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20200826112331.436350257@linutronix.de
|
|
No functional change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Joerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20200826112331.343103175@linutronix.de
|
|
Dereferencing irq_data before checking it for NULL is suboptimal.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
|
|
The __phys_to_dma vs phys_to_dma distinction isn't exactly obvious. Try
to improve the situation by renaming __phys_to_dma to
phys_to_dma_unencrypted, and by not forcing architectures that want to
override phys_to_dma to actually provide __phys_to_dma.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
|
|
Polling by MSI isn't necessarily faster than polling by SEV. Tests on
hi1620 show that hns3 100G NIC network throughput can improve from 25G to
27G if we disable MSI polling while running 16 netperf threads sending
32KB UDP packets. TX throughput can improve from 7G to 7.7G for a
single thread.
The reason for the throughput improvement is that the latency to poll
the completion of CMD_SYNC becomes smaller. After sending a CMD_SYNC
in an empty cmd queue, typically we need to wait for 280ns using MSI
polling. But we only need around 190ns after disabling MSI polling.
This patch provides a command line option so that users can decide
whether to use MSI polling based on their own tests.
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/20200827092957.22500-4-song.bao.hua@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Just use module_param() - going out of the way to specify a "different"
name that's identical to the variable name is silly.
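The resulting declaration then reads roughly like this (sketch; the default
value shown is illustrative):
    static bool disable_bypass = true;
    module_param(disable_bypass, bool, 0444);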
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/20200827092957.22500-3-song.bao.hua@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
This fixes the following checkpatch issue for the disable_bypass module
parameter:
WARNING: Symbolic permissions 'S_IRUGO' are not preferred. Consider using
octal permissions '0444'.
417: FILE: drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c:417:
module_param_named(disable_bypass, disable_bypass, bool, S_IRUGO);
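In other words, a one-line permission change (sketch):
    /* was: module_param_named(disable_bypass, disable_bypass, bool, S_IRUGO); */
    module_param_named(disable_bypass, disable_bypass, bool, 0444);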
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/20200827092957.22500-2-song.bao.hua@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The actual size of the level-1 stream table is l1size. This looks like an
oversight in commit d2e88e7c081ef ("iommu/arm-smmu: Fix LOG2SIZE setting
for 2-level stream tables"), which forgot to update @size in the error
message as well.
As a memory allocation failure is already bad enough, nothing worse would
happen. But let's be careful.
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Link: https://lore.kernel.org/r/20200826141758.341-1-yuzenghui@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The mapping operations of the Tegra SMMU driver are subject to a race
condition because an SMMU Address Space isn't allocated and freed
atomically, while it should be. This patch makes the mapping operations
atomic; it fixes an accidentally released Host1x Address Space problem
which happens while running multiple graphics tests in parallel on
Tegra30, i.e. with multiple threads racing with each other in the
Host1x submission and completion code paths, performing IOVA mappings
and unmappings in parallel.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Tested-by: Thierry Reding <treding@nvidia.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20200901203730.27865-1-digetx@gmail.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Fix the following warning in the SUN50I driver:
drivers/iommu/sun50i-iommu.c: In function 'sun50i_iommu_irq':
drivers/iommu/sun50i-iommu.c:890:14: warning: variable 'iova' set but not used [-Wunused-but-set-variable]
890 | phys_addr_t iova;
| ^~~~
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20200904113906.3906-1-joro@8bytes.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
In previous discussions [1] and [2], we found that it is risky to
use max_pfn or totalram_pages to tell whether 4GB mode is enabled.
Check 4GB mode by reading the infracfg register instead and remove the
usage of the unexported symbol max_pfn.
This is a step towards building mtk_iommu as a kernel module.
[1] https://lore.kernel.org/lkml/20200603161132.2441-1-miles.chen@mediatek.com/
[2] https://lore.kernel.org/lkml/20200604080120.2628-1-miles.chen@mediatek.com/
[3] https://lore.kernel.org/lkml/20200715205120.GA778876@bogus/
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Yong Wu <yong.wu@mediatek.com>
Cc: Yingjoe Chen <yingjoe.chen@mediatek.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Rob Herring <robh@kernel.org>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Cc: Joerg Roedel <joro@8bytes.org>
Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com>
Signed-off-by: Miles Chen <miles.chen@mediatek.com>
Link: https://lore.kernel.org/r/20200904104038.4979-1-miles.chen@mediatek.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Beware that the address size for x86-32 may exceed unsigned long.
[ 0.368971] UBSAN: shift-out-of-bounds in drivers/iommu/intel/iommu.c:128:14
[ 0.369055] shift exponent 36 is too large for 32-bit type 'long unsigned int'
If we don't handle the wide addresses, the pages are mismapped and the
device read/writes go astray, detected as DMAR faults and leading to
device failure. The behaviour changed (from working to broken) in commit
fa954e683178 ("iommu/vt-d: Delegate the dma domain to upper layer"), but
the error looks older.
Fixes: fa954e683178 ("iommu/vt-d: Delegate the dma domain to upper layer")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Cc: James Sewart <jamessewart@arista.com>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: <stable@vger.kernel.org> # v5.3+
Link: https://lore.kernel.org/r/20200822160209.28512-1-chris@chris-wilson.co.uk
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
atomic_xchg() performs better than atomic_cmpxchg() because no comparison
is required, and the value of @fq_timer_on can only be 0 or 1. Use
atomic_xchg() instead of atomic_cmpxchg() here because we only need to
know whether the value changed from 0 to 1 or was already 1.
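A sketch of the resulting pattern (simplified; field and constant names are
approximated from the iova code): arm the flush-queue timer only on the
0 -> 1 transition, relying on atomic_xchg() returning the old value:
    if (!atomic_read(&iovad->fq_timer_on) &&
        !atomic_xchg(&iovad->fq_timer_on, 1))
            mod_timer(&iovad->fq_timer,
                      jiffies + msecs_to_jiffies(IOVA_FQ_TIMEOUT));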
Signed-off-by: Yuqi Jin <jinyuqi@huawei.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Cc: Joerg Roedel <joro@8bytes.org>
Link: https://lore.kernel.org/r/1598517834-30275-1-git-send-email-zhangshaokun@hisilicon.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
When memory encryption is active the device is likely not in a direct
mapped domain. Forbid using IOMMUv2 functionality for now until finer
grained checks for this have been implemented.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20200824105415.21000-3-joro@8bytes.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Do not force devices supporting IOMMUv2 to be direct mapped when memory
encryption is active. This might cause them to be unusable because their
DMA mask does not include the encryption bit.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20200824105415.21000-2-joro@8bytes.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
The attempt to handle huge page allocations was originally added since
the comments around stripping __GFP_COMP in other implementations were
nonsensical, and we naively assumed that split_huge_page() could simply
be called equivalently to split_page(). It turns out that this doesn't
actually work correctly, so just get rid of it - there's little point
going to the effort of allocating huge pages if we're only going to
split them anyway.
Reported-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/e287dbe69aa0933abafd97c80631940fd188ddd1.1599132844.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
When using the 128-bit interrupt-remapping table entry (IRTE) (a.k.a. GA
mode), the current driver disables interrupt remapping when it updates the
IRTE so that the upper and lower 64-bit values can be updated safely.
However, this creates a small window where an interrupt could
arrive and result in an IO_PAGE_FAULT (for the interrupt) as shown below.
    IOMMU Driver                 Device IRQ
    ============                 ===========
    irte.RemapEn=0
         ...
    change IRTE                  IRQ from device ==> IO_PAGE_FAULT !!
         ...
    irte.RemapEn=1
This scenario has been observed when changing irq affinity on a system
running an I/O-intensive workload, in which the destination APIC ID
in the IRTE is updated.
Instead, use cmpxchg_double() to update the 128-bit IRTE at once without
disabling interrupt remapping. However, this means that several features
which require GA (128-bit IRTE) support will also be affected if cmpxchg16b
is not supported (which is unprecedented for AMD processors with an IOMMU).
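Roughly, the update then becomes a single 128-bit compare-and-exchange
(sketch; field names are approximated from the driver, not quoted from the
patch):
    ret = cmpxchg_double(&irte->lo.val, &irte->hi.val,
                         irte->lo.val, irte->hi.val,
                         entry->lo.val, entry->hi.val);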
Fixes: 880ac60e2538 ("iommu/amd: Introduce interrupt remapping ops structure")
Reported-by: Sean Osborne <sean.m.osborne@oracle.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Tested-by: Erik Rockstrom <erik.rockstrom@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Link: https://lore.kernel.org/r/20200903093822.52012-3-suravee.suthikulpanit@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Currently, the RemapEn (valid) bit is accidentally cleared when
programming IRTE w/ guestMode=0. It should be restored to
the prior state.
Fixes: b9fc6b56f478 ("iommu/amd: Implements irq_set_vcpu_affinity() hook to setup vapic mode for pass-through devices")
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Link: https://lore.kernel.org/r/20200903093822.52012-2-suravee.suthikulpanit@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
The dev_iommu_priv_set() must be called after probe_device(). This fixes
a NULL pointer dereference bug when booting a system with the kernel cmdline
"intel_iommu=on,igfx_off", where dev_iommu_priv_set() is abused.
The following stacktrace was produced:
Command line: BOOT_IMAGE=/isolinux/bzImage console=tty1 intel_iommu=on,igfx_off
...
DMAR: Host address width 39
DMAR: DRHD base: 0x000000fed90000 flags: 0x0
DMAR: dmar0: reg_base_addr fed90000 ver 1:0 cap 1c0000c40660462 ecap 19e2ff0505e
DMAR: DRHD base: 0x000000fed91000 flags: 0x1
DMAR: dmar1: reg_base_addr fed91000 ver 1:0 cap d2008c40660462 ecap f050da
DMAR: RMRR base: 0x0000009aa9f000 end: 0x0000009aabefff
DMAR: RMRR base: 0x0000009d000000 end: 0x0000009f7fffff
DMAR: No ATSR found
BUG: kernel NULL pointer dereference, address: 0000000000000038
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
PGD 0 P4D 0
Oops: 0002 [#1] SMP PTI
CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.9.0-devel+ #2
Hardware name: LENOVO 20HGS0TW00/20HGS0TW00, BIOS N1WET46S (1.25s ) 03/30/2018
RIP: 0010:intel_iommu_init+0xed0/0x1136
Code: fe e9 61 02 00 00 bb f4 ff ff ff e9 57 02 00 00 48 63 d1 48 c1 e2 04 48
03 50 20 48 8b 12 48 85 d2 74 0b 48 8b 92 d0 02 00 00 48 89 7a 38 ff c1
e9 15 f5 ff ff 48 c7 c7 60 99 ac a7 49 c7 c7 a0
RSP: 0000:ffff96d180073dd0 EFLAGS: 00010282
RAX: ffff8c91037a7d20 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffffffffff
RBP: ffff96d180073e90 R08: 0000000000000001 R09: ffff8c91039fe3c0
R10: 0000000000000226 R11: 0000000000000226 R12: 000000000000000b
R13: ffff8c910367c650 R14: ffffffffa8426d60 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff8c9107480000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000038 CR3: 00000004b100a001 CR4: 00000000003706e0
Call Trace:
? _raw_spin_unlock_irqrestore+0x1f/0x30
? call_rcu+0x10e/0x320
? trace_hardirqs_on+0x2c/0xd0
? rdinit_setup+0x2c/0x2c
? e820__memblock_setup+0x8b/0x8b
pci_iommu_init+0x16/0x3f
do_one_initcall+0x46/0x1e4
kernel_init_freeable+0x169/0x1b2
? rest_init+0x9f/0x9f
kernel_init+0xa/0x101
ret_from_fork+0x22/0x30
Modules linked in:
CR2: 0000000000000038
---[ end trace 3653722a6f936f18 ]---
Fixes: 01b9d4e21148c ("iommu/vt-d: Use dev_iommu_priv_get/set()")
Reported-by: Torsten Hilbrich <torsten.hilbrich@secunet.com>
Reported-by: Wendy Wang <wendy.wang@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Tested-by: Torsten Hilbrich <torsten.hilbrich@secunet.com>
Link: https://lore.kernel.org/linux-iommu/96717683-70be-7388-3d2f-61131070a96a@secunet.com/
Link: https://lore.kernel.org/r/20200903065132.16879-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
The VT-d spec requires (10.4.4 Global Command Register, GCMD_REG General
Description) that:
If multiple control fields in this register need to be modified, software
must serialize the modifications through multiple writes to this register.
However, in irq_remapping.c, modifications of IRE and CFI are done in one
write. We need to do two separate writes with STS checking after each one.
Also check the status register before writing the command register to
avoid unnecessary register writes.
Fixes: af8d102f999a4 ("x86/intel/irq_remapping: Clean up x2apic opt-out security warning mess")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Link: https://lore.kernel.org/r/20200828000615.8281-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
To keep naming consistent we should stick with *iotlb*. This patch
renames a few remaining functions.
Signed-off-by: Tom Murphy <murphyt7@tcd.ie>
Link: https://lore.kernel.org/r/20200817210051.13546-1-murphyt7@tcd.ie
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
In order to share groups between multiple devices we keep track of them
in a per-SMMU list. When an IOMMU group is released, a dangling pointer
to it stays around in that list. Fix this by implementing an IOMMU data
release callback for groups where the dangling pointer can be removed.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20200806155404.3936074-4-thierry.reding@gmail.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
For groups that are shared between multiple devices, care must be taken
to acquire a reference for each device, otherwise the IOMMU core ends up
dropping the last reference too early, which will cause the group to be
released while consumers may still be thinking that they're holding a
reference to it.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20200806155404.3936074-3-thierry.reding@gmail.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Set the name of static IOMMU groups to help with debugging.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20200806155404.3936074-2-thierry.reding@gmail.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
struct of_device_id is included unconditionally by the of.h header and
the match table is used in the driver as well. Remove of_match_ptr() to
fix a W=1 compile test warning with !CONFIG_OF:
drivers/iommu/qcom_iommu.c:910:34: warning: 'qcom_iommu_of_match' defined but not used [-Wunused-const-variable=]
910 | static const struct of_device_id qcom_iommu_of_match[] = {
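The fix boils down to referencing the table directly in the driver
definition (sketch with abridged fields; the .name value is illustrative):
    static struct platform_driver qcom_iommu_driver = {
            .driver = {
                    .name           = "qcom-iommu",
                    .of_match_table = qcom_iommu_of_match, /* was: of_match_ptr(qcom_iommu_of_match) */
            },
    };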
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Link: https://lore.kernel.org/r/20200728170859.28143-3-krzk@kernel.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Fix W=1 compile warnings (invalid kerneldoc):
drivers/iommu/intel/dmar.c:389: warning: Function parameter or member 'header' not described in 'dmar_parse_one_drhd'
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Link: https://lore.kernel.org/r/20200728170859.28143-2-krzk@kernel.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Fix W=1 compile warnings (invalid kerneldoc):
drivers/iommu/amd/init.c:1586: warning: Function parameter or member 'ivrs' not described in 'get_highest_supported_ivhd_type'
drivers/iommu/amd/init.c:1938: warning: Function parameter or member 'iommu' not described in 'iommu_update_intcapxt'
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Link: https://lore.kernel.org/r/20200728170859.28143-1-krzk@kernel.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
A few exported functions from the AMD IOMMU driver are missing prototypes.
They have declarations in arch/x86/events/amd/iommu.h, but this file
cannot be included in the driver. Add prototypes to fix W=1 warnings
like:
drivers/iommu/amd/init.c:3066:19: warning:
no previous prototype for 'get_amd_iommu' [-Wmissing-prototypes]
3066 | struct amd_iommu *get_amd_iommu(unsigned int idx)
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Link: https://lore.kernel.org/r/20200727183631.16744-1-krzk@kernel.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
struct of_device_id is included unconditionally by the of.h header and
the match table is used in the driver as well. Remove of_match_ptr() to
fix a W=1 compile test warning with !CONFIG_OF:
drivers/iommu/mtk_iommu.c:833:34: warning: 'mtk_iommu_of_ids' defined but not used [-Wunused-const-variable=]
833 | static const struct of_device_id mtk_iommu_of_ids[] = {
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Link: https://lore.kernel.org/r/20200727181842.8441-1-krzk@kernel.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Replace the existing /* fall through */ comments and their variants with
the new pseudo-keyword macro fallthrough [1]. Also remove unnecessary
fall-through markings where that is the case.
[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through
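For illustration only (identifiers are made up, not from the patch), the
conversion looks like this:
    switch (state) {
    case STATE_INIT:
            setup();
            fallthrough;    /* replaces the old "fall through" comment */
    case STATE_READY:
            run();
            break;
    default:
            break;
    }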
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
|
|
When allocating coherent pool memory for an IOMMU mapping we don't care
about the DMA mask. Move the guess for the initial GFP mask into
dma_direct_alloc_pages and pass dma_coherent_ok as a function pointer
argument so that it doesn't get applied to the IOMMU case.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Amit Pundir <amit.pundir@linaro.org>
|
|
Merge more updates from Andrew Morton:
- most of the rest of MM (memcg, hugetlb, vmscan, proc, compaction,
mempolicy, oom-kill, hugetlbfs, migration, thp, cma, util,
memory-hotplug, cleanups, uaccess, migration, gup, pagemap),
- various other subsystems (alpha, misc, sparse, bitmap, lib, bitops,
checkpatch, autofs, minix, nilfs, ufs, fat, signals, kmod, coredump,
exec, kdump, rapidio, panic, kcov, kgdb, ipc).
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (164 commits)
mm/gup: remove task_struct pointer for all gup code
mm: clean up the last pieces of page fault accountings
mm/xtensa: use general page fault accounting
mm/x86: use general page fault accounting
mm/sparc64: use general page fault accounting
mm/sparc32: use general page fault accounting
mm/sh: use general page fault accounting
mm/s390: use general page fault accounting
mm/riscv: use general page fault accounting
mm/powerpc: use general page fault accounting
mm/parisc: use general page fault accounting
mm/openrisc: use general page fault accounting
mm/nios2: use general page fault accounting
mm/nds32: use general page fault accounting
mm/mips: use general page fault accounting
mm/microblaze: use general page fault accounting
mm/m68k: use general page fault accounting
mm/ia64: use general page fault accounting
mm/hexagon: use general page fault accounting
mm/csky: use general page fault accounting
...
|
|
Patch series "mm: Page fault accounting cleanups", v5.
This is v5 of the pf accounting cleanup series. It originates from Gerald
Schaefer's report a week ago regarding incorrect page fault
accounting for retried page faults after commit 4064b9827063 ("mm: allow
VM_FAULT_RETRY for multiple times"):
https://lore.kernel.org/lkml/20200610174811.44b94525@thinkpad/
What this series did:
- Correct page fault accounting: we do accounting for a page fault
(no matter whether it's from #PF handling, gup, or anything else)
only once, in the attempt that completed the fault. For example, page fault
retries should not be counted in the page fault counters. The same applies
to the perf events.
- Unify the definition of PERF_COUNT_SW_PAGE_FAULTS: currently this perf
event is used in an ad hoc way across different archs.
Case (1): for many archs it's done at the entry of a page fault
handler, so that it will also cover e.g. erroneous faults.
Case (2): for some other archs, it is only accounted when the page
fault is resolved successfully.
Case (3): there are still quite a few archs that have not enabled
this perf event.
Since this series touches nearly all the archs, we unify this
perf event to always follow case (1), which is the one that makes the most
sense. And since we moved the accounting into handle_mm_fault, the
other two MAJ/MIN perf events are well taken care of naturally.
- Unify definition of "major faults": the definition of "major
fault" is slightly changed when used in accounting (not
VM_FAULT_MAJOR). More information in patch 1.
- Always account the page fault to the task that triggered the page
fault. This does not matter much for #PF handling, but it does for
gup. More information on this is in patch 25.
Patchset layout:
Patch 1: Introduced the accounting in handle_mm_fault(), not enabled.
Patch 2-23: Enable the new accounting for arch #PF handlers one by one.
Patch 24: Enable the new accounting for the rest outliers (gup, iommu, etc.)
Patch 25: Cleanup GUP task_struct pointer since it's not needed any more
This patch (of 25):
This is a preparation patch to move page fault accounting into the
general code in handle_mm_fault(). This includes both the per-task
flt_maj/flt_min counters and the major/minor page fault perf events. To
do this, the pt_regs pointer is passed into handle_mm_fault().
PERF_COUNT_SW_PAGE_FAULTS should still be kept in the per-arch page fault
handlers.
So far, all the pt_regs pointers passed into handle_mm_fault() are
NULL, which means this patch should have no intended functional change.
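The resulting prototype looks roughly like this (sketch; the existing
vma/address/flags parameters are unchanged, only the pt_regs pointer is new):
    vm_fault_t handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
                               unsigned int flags, struct pt_regs *regs);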
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: http://lkml.kernel.org/r/20200707225021.200906-1-peterx@redhat.com
Link: http://lkml.kernel.org/r/20200707225021.200906-2-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|