From aec0e86172a79eb5e44aff1055bb953fe4d47c59 Mon Sep 17 00:00:00 2001 From: Xunlei Pang Date: Mon, 5 Dec 2016 20:09:07 +0800 Subject: iommu/vt-d: Flush old iommu caches for kdump when the device gets context mapped We met the DMAR fault both on hpsa P420i and P421 SmartArray controllers under kdump, it can be steadily reproduced on several different machines, the dmesg log is like: HP HPSA Driver (v 3.4.16-0) hpsa 0000:02:00.0: using doorbell to reset controller hpsa 0000:02:00.0: board ready after hard reset. hpsa 0000:02:00.0: Waiting for controller to respond to no-op DMAR: Setting identity map for device 0000:02:00.0 [0xe8000 - 0xe8fff] DMAR: Setting identity map for device 0000:02:00.0 [0xf4000 - 0xf4fff] DMAR: Setting identity map for device 0000:02:00.0 [0xbdf6e000 - 0xbdf6efff] DMAR: Setting identity map for device 0000:02:00.0 [0xbdf6f000 - 0xbdf7efff] DMAR: Setting identity map for device 0000:02:00.0 [0xbdf7f000 - 0xbdf82fff] DMAR: Setting identity map for device 0000:02:00.0 [0xbdf83000 - 0xbdf84fff] DMAR: DRHD: handling fault status reg 2 DMAR: [DMA Read] Request device [02:00.0] fault addr fffff000 [fault reason 06] PTE Read access is not set hpsa 0000:02:00.0: controller message 03:00 timed out hpsa 0000:02:00.0: no-op failed; re-trying After some debugging, we found that the fault addr is from DMA initiated at the driver probe stage after reset(not in-flight DMA), and the corresponding pte entry value is correct, the fault is likely due to the old iommu caches of the in-flight DMA before it. Thus we need to flush the old cache after context mapping is setup for the device, where the device is supposed to finish reset at its driver probe stage and no in-flight DMA exists hereafter. I'm not sure if the hardware is responsible for invalidating all the related caches allocated in the iommu hardware before, but seems not the case for hpsa, actually many device drivers have problems in properly resetting the hardware. Anyway flushing (again) by software in kdump kernel when the device gets context mapped which is a quite infrequent operation does little harm. With this patch, the problematic machine can survive the kdump tests. CC: Myron Stowe CC: Joseph Szczypek CC: Don Brace CC: Baoquan He CC: Dave Young Fixes: 091d42e43d21 ("iommu/vt-d: Copy translation tables from old kernel") Fixes: dbcd861f252d ("iommu/vt-d: Do not re-use domain-ids from the old kernel") Fixes: cf484d0e6939 ("iommu/vt-d: Mark copied context entries") Signed-off-by: Xunlei Pang Tested-by: Don Brace Signed-off-by: Joerg Roedel --- drivers/iommu/intel-iommu.c | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) (limited to 'drivers') diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index c66c273dfd8a..8862dc1e88eb 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -2037,6 +2037,25 @@ static int domain_context_mapping_one(struct dmar_domain *domain, if (context_present(context)) goto out_unlock; + /* + * For kdump cases, old valid entries may be cached due to the + * in-flight DMA and copied pgtable, but there is no unmapping + * behaviour for them, thus we need an explicit cache flush for + * the newly-mapped device. For kdump, at this point, the device + * is supposed to finish reset at its driver probe stage, so no + * in-flight DMA will exist, and we don't need to worry anymore + * hereafter. + */ + if (context_copied(context)) { + u16 did_old = context_domain_id(context); + + if (did_old >= 0 && did_old < cap_ndoms(iommu->cap)) + iommu->flush.flush_context(iommu, did_old, + (((u16)bus) << 8) | devfn, + DMA_CCMD_MASK_NOBIT, + DMA_CCMD_DEVICE_INVL); + } + pgd = domain->pgd; context_clear_entry(context); -- cgit v1.2.3 From 65ca7f5f7d1cdde6c25172fe6107cd16902f826f Mon Sep 17 00:00:00 2001 From: Jacob Pan Date: Tue, 6 Dec 2016 10:14:23 -0800 Subject: iommu/vt-d: Fix pasid table size encoding Different encodings are used to represent supported PASID bits and number of PASID table entries. The current code assigns ecap_pss directly to extended context table entry PTS which is wrong and could result in writing non-zero bits to the reserved fields. IOMMU fault reason 11 will be reported when reserved bits are nonzero. This patch converts ecap_pss to extend context entry pts encoding based on VT-d spec. Chapter 9.4 as follows: - number of PASID bits = ecap_pss + 1 - number of PASID table entries = 2^(pts + 5) Software assigned limit of pasid_max value is also respected to match the allocation limitation of PASID table. cc: Mika Kuoppala cc: Ashok Raj Signed-off-by: Jacob Pan Tested-by: Mika Kuoppala Fixes: 2f26e0a9c9860 ('iommu/vt-d: Add basic SVM PASID support') Signed-off-by: Joerg Roedel --- drivers/iommu/intel-iommu.c | 23 ++++++++++++++++++++++- 1 file changed, 22 insertions(+), 1 deletion(-) (limited to 'drivers') diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index 8862dc1e88eb..8a185250ae5a 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -5204,6 +5204,25 @@ static void intel_iommu_remove_device(struct device *dev) } #ifdef CONFIG_INTEL_IOMMU_SVM +#define MAX_NR_PASID_BITS (20) +static inline unsigned long intel_iommu_get_pts(struct intel_iommu *iommu) +{ + /* + * Convert ecap_pss to extend context entry pts encoding, also + * respect the soft pasid_max value set by the iommu. + * - number of PASID bits = ecap_pss + 1 + * - number of PASID table entries = 2^(pts + 5) + * Therefore, pts = ecap_pss - 4 + * e.g. KBL ecap_pss = 0x13, PASID has 20 bits, pts = 15 + */ + if (ecap_pss(iommu->ecap) < 5) + return 0; + + /* pasid_max is encoded as actual number of entries not the bits */ + return find_first_bit((unsigned long *)&iommu->pasid_max, + MAX_NR_PASID_BITS) - 5; +} + int intel_iommu_enable_pasid(struct intel_iommu *iommu, struct intel_svm_dev *sdev) { struct device_domain_info *info; @@ -5236,7 +5255,9 @@ int intel_iommu_enable_pasid(struct intel_iommu *iommu, struct intel_svm_dev *sd if (!(ctx_lo & CONTEXT_PASIDE)) { context[1].hi = (u64)virt_to_phys(iommu->pasid_state_table); - context[1].lo = (u64)virt_to_phys(iommu->pasid_table) | ecap_pss(iommu->ecap); + context[1].lo = (u64)virt_to_phys(iommu->pasid_table) | + intel_iommu_get_pts(iommu); + wmb(); /* CONTEXT_TT_MULTI_LEVEL and CONTEXT_TT_DEV_IOTLB are both * extended to permit requests-with-PASID if the PASIDE bit -- cgit v1.2.3 From 432abf68a79332282329286d190e21fe3ac02a31 Mon Sep 17 00:00:00 2001 From: Huang Rui Date: Mon, 12 Dec 2016 07:28:26 -0500 Subject: iommu/amd: Fix the left value check of cmd buffer The generic command buffer entry is 128 bits (16 bytes), so the offset of tail and head pointer should be 16 bytes aligned and increased with 0x10 per command. When cmd buf is full, head = (tail + 0x10) % CMD_BUFFER_SIZE. So when left space of cmd buf should be able to store only two command, we should be issued one COMPLETE_WAIT additionally to wait all older commands completed. Then the left space should be increased after IOMMU fetching from cmd buf. So left check value should be left <= 0x20 (two commands). Signed-off-by: Huang Rui Fixes: ac0ea6e92b222 ('x86/amd-iommu: Improve handling of full command buffer') Signed-off-by: Joerg Roedel --- drivers/iommu/amd_iommu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'drivers') diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c index 019e02707cd5..3ef0f42984f2 100644 --- a/drivers/iommu/amd_iommu.c +++ b/drivers/iommu/amd_iommu.c @@ -1023,7 +1023,7 @@ again: next_tail = (tail + sizeof(*cmd)) % CMD_BUFFER_SIZE; left = (head - next_tail) % CMD_BUFFER_SIZE; - if (left <= 2) { + if (left <= 0x20) { struct iommu_cmd sync_cmd; int ret; -- cgit v1.2.3