Commit graph

162 commits

Author SHA1 Message Date
David Woodhouse
40e4aa3432 intel-iommu: Add iommu_should_identity_map() function
We do this twice, and it's about to get more complicated. This makes the
code slightly clearer about what it's doing, too.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-07-04 10:55:41 +01:00
David Woodhouse
1b7bc0a161 intel-iommu: Fix reattaching of devices to identity mapping domain
When we reattach a device to the si_domain (because it's been removed
from a VM), we weren't calling domain_context_mapping() to actually tell
the hardware about that.

We should really put the call to domain_context_mapping() into
domain_add_dev_info() -- we never call the latter without also doing the
former, and we can keep the error paths simple that way. But that's a
cleanup which can wait for 2.6.32 now.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-07-04 10:49:46 +01:00
David Woodhouse
1e4c64c46d intel-iommu: Don't set identity mapping for bypassed graphics devices
We should check iommu_dummy() _first_, because that means it's attached
to an iommu that we've just disabled completely. At the moment, we might
try to put the device into the identity mapping domain.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-07-04 10:40:44 +01:00
David Woodhouse
5a5e02a614 intel-iommu: Fix dma vs. mm page confusion with aligned_nrpages()
The aligned_nrpages() function rounds up to the next VM page, but
returns its result as a number of DMA pages.

Purely theoretical except on IA64, which doesn't boot with VT-d right
now anyway.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-07-04 09:35:52 +01:00
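
The unit mix-up described in 5a5e02a614 can be illustrated with a minimal user-space sketch. It is not the driver's code: the `mm_page_shift` parameter is a hypothetical stand-in for the kernel's `PAGE_SHIFT`, and the 4KiB VT-d page size is assumed. The function rounds the mapping up to whole VM pages, as the message describes, and reports the result as a count of VT-d pages.

```c
#include <assert.h>
#include <stddef.h>

#define VTD_PAGE_SHIFT 12   /* assumed: VT-d pages are 4KiB */

/*
 * Round [host_addr, host_addr + size) up to whole VM pages and return
 * the result as a number of VT-d (DMA) pages. mm_page_shift is a
 * hypothetical parameter standing in for the kernel's PAGE_SHIFT.
 */
static unsigned long aligned_nrpages(unsigned long host_addr, size_t size,
                                     unsigned int mm_page_shift)
{
    unsigned long mm_page_size = 1UL << mm_page_shift;
    unsigned long offset;
    unsigned long span;

    /* VM pages smaller than VT-d pages are outside this sketch. */
    assert(mm_page_shift >= VTD_PAGE_SHIFT);

    /* Keep only the offset within the first VM page. */
    offset = host_addr & (mm_page_size - 1);

    /* Round the covered span up to a multiple of the VM page size. */
    span = (offset + size + mm_page_size - 1) & ~(mm_page_size - 1);

    /* Report it in VT-d pages, not VM pages. */
    return span >> VTD_PAGE_SHIFT;
}
```

With 4KiB VM pages the two units coincide, which is why the difference only shows up on configurations with larger pages: a 256-byte buffer inside one 16KiB VM page yields four VT-d pages here.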
David Woodhouse
6a43e574c5 intel-iommu: Don't keep freeing page zero in dma_pte_free_pagetable()
Check dma_pte_present() and only free the page if there _is_ one.
Kind of surprising that there was no warning about this.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-07-02 12:02:38 +01:00
David Woodhouse
75e6bf9638 intel-iommu: Introduce first_pte_in_page() to simplify PTE-setting loops
On Wed, 2009-07-01 at 16:59 -0700, Linus Torvalds wrote:
> I also _really_ hate how you do
>
>         (unsigned long)pte >> VTD_PAGE_SHIFT ==
>         (unsigned long)first_pte >> VTD_PAGE_SHIFT

Kill this, in favour of just looking to see if the incremented pte
pointer has 'wrapped' onto the next page. Which means we have to check
it _after_ incrementing it, not before.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-07-02 11:27:13 +01:00
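
The 'wrapped onto the next page' test from 75e6bf9638 might look roughly like the following user-space sketch. The `struct dma_pte` layout and the `ptes_until_wrap()` helper are assumptions for illustration only. The point is that the check runs after the pointer has been incremented, so a true result means the pointer has just left the page it started in.

```c
#include <stdint.h>

#define VTD_PAGE_SHIFT 12
#define VTD_PAGE_SIZE  (1UL << VTD_PAGE_SHIFT)

/* Assumed layout: one 64-bit value per PTE, 512 PTEs per 4KiB page. */
struct dma_pte {
    uint64_t val;
};

/* True if pte is the first entry of a page-table page. */
static int first_pte_in_page(const struct dma_pte *pte)
{
    return ((uintptr_t)pte & (VTD_PAGE_SIZE - 1)) == 0;
}

/*
 * Hypothetical helper: count the PTEs visited from pte up to the end
 * of its page. The test is made after the increment, not before.
 */
static unsigned int ptes_until_wrap(const struct dma_pte *pte)
{
    unsigned int n = 0;

    do {
        n++;
        pte++;
    } while (!first_pte_in_page(pte));

    return n;
}
```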
David Woodhouse
7766a3fb90 intel-iommu: Use cmpxchg64_local() for setting PTEs
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-07-01 20:27:03 +01:00
David Woodhouse
85b98276f2 intel-iommu: Warn about unmatched unmap requests
This would have found the bug in i386 pci_unmap_addr() a long time ago.
We shouldn't just silently return without doing anything.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-07-01 19:54:37 +01:00
David Woodhouse
206a73c102 intel-iommu: Kill superfluous mapping_lock
Since we're using cmpxchg64() anyway (because that's the only way to do
an atomic 64-bit store on i386), we might as well ditch the extra
locking and just use cmpxchg64() to ensure that we don't add the page
twice.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-07-01 19:43:37 +01:00
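
The lock-free idea in 206a73c102 can be sketched in portable C11, with `atomic_compare_exchange_strong()` standing in for the kernel's cmpxchg64(). The `install_if_empty()` name is hypothetical, and the sketch assumes that an all-zero entry means "no page installed". Only one caller can move the entry from zero to a non-zero value; a caller that loses the race is expected to free the page it allocated and use the one already there.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Try to install new_val into an empty (zero) page-table entry.
 * Returns true if this caller installed it, false if another value
 * was already present, in which case the entry is left unchanged.
 */
static bool install_if_empty(_Atomic uint64_t *pte, uint64_t new_val)
{
    uint64_t expected = 0;

    /* Zero is reserved to mean "empty" in this sketch. */
    assert(new_val != 0);

    return atomic_compare_exchange_strong(pte, &expected, new_val);
}
```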
David Woodhouse
c85994e477 intel-iommu: Ensure that PTE writes are 64-bit atomic, even on i386
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-07-01 19:21:24 +01:00
David Woodhouse
f3a0a52fff intel-iommu: Performance improvement for dma_pte_free_pagetable()
As with other functions, batch the CPU data cache flushes and don't keep
recalculating PTE addresses.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-30 03:58:15 +01:00
David Woodhouse
3d7b0e4154 intel-iommu: Don't free too much in dma_pte_free_pagetable()
The loop condition was wrong -- we should free a PMD only if its
_entire_ range is within the range we're intending to clear. The
early-termination condition was right, but not the loop.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-30 03:57:38 +01:00
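
The corrected loop condition from 3d7b0e4154 can be modelled in a short sketch that only counts which lower-level tables would be freed. The helper names, the 9-bit stride per level and the inclusive pfn range are assumptions here, not quotations from the driver. A table is counted only when the whole range it covers lies inside the range being cleared.

```c
#include <assert.h>

#define LEVEL_STRIDE 9   /* assumed: 512 entries per table */

/* Number of pfns covered by one entry at the given level (level 1 = PTE). */
static unsigned long level_size(int level)
{
    return 1UL << ((level - 1) * LEVEL_STRIDE);
}

/* Round pfn up to the next boundary of the given level. */
static unsigned long align_to_level(unsigned long pfn, int level)
{
    return (pfn + level_size(level) - 1) & ~(level_size(level) - 1);
}

/*
 * Count the entries at this level whose entire range lies inside the
 * inclusive range [start_pfn, last_pfn]. Only those may be freed.
 */
static unsigned long count_freeable(unsigned long start_pfn,
                                    unsigned long last_pfn, int level)
{
    unsigned long n = 0;
    unsigned long tmp;

    assert(start_pfn <= last_pfn);

    tmp = align_to_level(start_pfn, level);
    while (tmp + level_size(level) - 1 <= last_pfn) {
        n++;
        tmp += level_size(level);
    }

    return n;
}
```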
David Woodhouse
1bf20f0dc5 intel-iommu: dump mappings but don't die on pte already set
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-30 03:55:21 +01:00
David Woodhouse
9051aa0268 intel-iommu: Combine domain_pfn_mapping() and domain_sg_mapping()
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-30 03:53:31 +01:00
David Woodhouse
e1605495c7 intel-iommu: Introduce domain_sg_mapping() to speed up intel_map_sg()
Instead of calling domain_pfn_mapping() repeatedly with single or
small numbers of pages, just pass the sglist in. It can optimise the
number of cache flushes like domain_pfn_mapping() does, and gives a huge
speedup for large scatterlists.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-30 03:51:30 +01:00
David Woodhouse
875764de6f intel-iommu: Simplify __intel_alloc_iova()
There's no need for the separate iommu_alloc_iova() function, and
certainly not for it to be global. Remove the underscores while we're at
it.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-29 13:39:53 +01:00
David Woodhouse
6f6a00e40a intel-iommu: Performance improvement for domain_pfn_mapping()
As with dma_pte_clear_range(), don't keep flushing a single PTE at a
time. And also micro-optimise the setting of PTE values rather than
using the helper functions to do all the masking.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-29 13:39:45 +01:00
David Woodhouse
310a5ab93c intel-iommu: Performance improvement for dma_pte_clear_range()
It's a bit silly to repeatedly call domain_flush_cache() for each PTE
individually, as we clear it. Instead, batch them up and flush a whole
range at a time. We might as well refrain from recalculating the PTE
address from scratch each time round the loop too.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-29 13:39:17 +01:00
David Woodhouse
c5395d5c4a intel-iommu: Clean up iommu_domain_identity_map()
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-29 13:39:12 +01:00
David Woodhouse
1a4a45516d intel-iommu: Remove last use of PHYSICAL_PAGE_MASK, for reserving PCI BARs
This is fairly broken anyway -- it doesn't take hotplug into account.
We should probably be checking page_is_ram() instead.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-29 13:39:05 +01:00
David Woodhouse
03d6a2461a intel-iommu: Make iommu_flush_iotlb_psi() take pfn as argument
Most of its callers are having to shift for themselves anyway, so we might
as well do it in iommu_flush_iotlb_psi().

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-29 13:38:11 +01:00
David Woodhouse
88cb6a7424 intel-iommu: Change aligned_size() to aligned_nrpages()
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-29 13:38:04 +01:00
David Woodhouse
b536d24d21 intel-iommu: Clean up intel_map_sg(), remove domain_page_mapping()
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-29 13:35:06 +01:00
David Woodhouse
ad05122162 intel-iommu: Use domain_pfn_mapping() in intel_iommu_map_range()
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-29 13:35:00 +01:00
David Woodhouse
0ab36de274 intel-iommu: Use domain_pfn_mapping() in __intel_map_single()
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-29 13:34:24 +01:00
David Woodhouse
61df744314 intel-iommu: Introduce domain_pfn_mapping()
... and use it in the trivial cases; the other callers want individual
(and bisectable) attention, since I screwed them up the first time...

Make the BUG_ON() happen on too-large virtual address rather than
physical address, too. That's the one we care about.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-29 13:33:59 +01:00
David Woodhouse
1c5a46ed49 intel-iommu: Clean up address handling in domain_page_mapping()
No more masking and alignment; just use pfns.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-29 13:33:11 +01:00
David Woodhouse
b026fd28ea intel-iommu: Change addr_to_dma_pte() to pfn_to_dma_pte()
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-29 13:32:26 +01:00
David Woodhouse
163cc52ccd intel-iommu: Clean up intel_iommu_unmap_range()
Use unaligned address for domain->max_addr. That algorithm isn't ideal
anyway -- we should probably just look at the last iova in the tree.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-29 13:31:12 +01:00
David Woodhouse
d794dc9b30 intel-iommu: Make dma_pte_free_pagetable() take pfns as argument
With some cleanup of intel_unmap_page(), intel_unmap_sg() and
vm_domain_exit() to no longer play with 64-bit addresses.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-29 13:30:45 +01:00
David Woodhouse
6660c63a79 intel-iommu: Make dma_pte_free_pagetable() use pfns
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-29 13:30:35 +01:00
David Woodhouse
595badf5d6 intel-iommu: Make dma_pte_clear_range() take pfns as argument
Noting that this is now an _inclusive_ range.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-29 13:28:10 +01:00
David Woodhouse
04b18e65dd intel-iommu: Make dma_pte_clear_range() use pfns
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-29 13:26:36 +01:00
David Woodhouse
66eae8469e intel-iommu: Don't just mask out too-big physical addresses; BUG() instead
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-29 12:38:42 +01:00
David Woodhouse
a75f7cf94f intel-iommu: Make dma_pte_clear_one() take pfn not address
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-29 12:38:40 +01:00
David Woodhouse
90dcfb5eb2 intel-iommu: Change dma_addr_level_pte() to dma_pfn_level_pte()
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-29 12:38:38 +01:00
David Woodhouse
77dfa56c94 intel-iommu: Change address_level_offset() to pfn_level_offset()
We're shifting the inputs for now, but that'll change...

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-29 12:38:32 +01:00
David Woodhouse
dd4e831960 intel-iommu: Change dma_set_pte_addr() to dma_set_pte_pfn()
Add some helpers for converting between VT-d and normal system pfns,
since system pages can be larger than VT-d pages.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-29 12:38:11 +01:00
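
The pfn conversion helpers mentioned in dd4e831960 could look something like the sketch below. The helper names and the 16KiB system page size (`PAGE_SHIFT` of 14) are assumptions chosen to make the difference visible; on a system with 4KiB pages both shifts are by zero and the two kinds of pfn are identical.

```c
#define VTD_PAGE_SHIFT 12   /* VT-d pages: 4KiB */
#define PAGE_SHIFT     14   /* assumed system page size: 16KiB */

_Static_assert(PAGE_SHIFT >= VTD_PAGE_SHIFT,
               "system pages must not be smaller than VT-d pages");

/* System (mm) pfn to the first VT-d pfn inside that system page. */
static unsigned long mm_to_dma_pfn(unsigned long mm_pfn)
{
    return mm_pfn << (PAGE_SHIFT - VTD_PAGE_SHIFT);
}

/* VT-d pfn to the system (mm) pfn that contains it. */
static unsigned long dma_to_mm_pfn(unsigned long dma_pfn)
{
    return dma_pfn >> (PAGE_SHIFT - VTD_PAGE_SHIFT);
}
```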
David Woodhouse
c7ab48d2ac intel-iommu: Clean up identity mapping code, remove CONFIG_DMAR_GFX_WA
There's no need for the GFX workaround now we have 'iommu=pt' for the
cases where people really care about performance. There's no need to
have a special case for just one type of device.

This also speeds up the iommu=pt path and reduces memory usage by
setting up the si_domain _once_ and then using it for all devices,
rather than giving each device its own private page tables.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-29 12:37:44 +01:00
David Woodhouse
b213203e47 intel-iommu: Create new iommu_domain_identity_map() function
We'll want to do this to a _domain_ (the si_domain) rather than a PCI device.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-29 12:37:42 +01:00
Yu Zhao
bf92df30df intel-iommu: Only avoid flushing device IOTLB for domain ID 0 in caching mode
In caching mode, domain ID 0 is reserved for non-present to present
mapping flush. Device IOTLB doesn't need to be flushed in this case.

Previously we were avoiding the flush for domain zero, even if the IOMMU
wasn't in caching mode and domain zero wasn't special.

Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-29 12:34:11 +01:00
Chris Wright
7e25a24229 intel-iommu: fix Identity Mapping to be arch independent
Drop the e820 scanning and use existing function for finding valid
RAM regions to add to 1:1 mapping.

Signed-off-by: Chris Wright <chrisw@redhat.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-26 11:26:27 +01:00
Fenghua Yu
2c2e2c389d IOMMU Identity Mapping Support (drivers/pci/intel_iommu.c)
Identity mapping for IOMMU defines a single domain to 1:1 map all PCI
devices to all usable memory.

This reduces map/unmap overhead in the DMA API and improves IOMMU
performance. On 10Gb network cards, Netperf shows no performance
degradation compared to non-IOMMU performance.

This method may lose some of the benefits of DMA remapping, such as
isolation.

The patch sets up identity mapping for all PCI devices to all usable
memory. In the DMA API, there is no overhead to maintain page tables,
invalidate iotlb, flush cache etc.

32-bit DMA devices don't use the identity mapping domain, so that they
can still access memory beyond 4GiB.

With the kernel option iommu=pt, pass through is tried first. If pass
through succeeds, the IOMMU operates in pass-through mode. If pass
through is not supported in hardware or fails for whatever reason, the
IOMMU falls back to identity mapping.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-23 22:07:54 +01:00
Linus Torvalds
687d680985 Merge git://git.infradead.org/~dwmw2/iommu-2.6.31
* git://git.infradead.org/~dwmw2/iommu-2.6.31:
  intel-iommu: Fix one last ia64 build problem in Pass Through Support
  VT-d: support the device IOTLB
  VT-d: cleanup iommu_flush_iotlb_psi and flush_unmaps
  VT-d: add device IOTLB invalidation support
  VT-d: parse ATSR in DMA Remapping Reporting Structure
  PCI: handle Virtual Function ATS enabling
  PCI: support the ATS capability
  intel-iommu: dmar_set_interrupt return error value
  intel-iommu: Tidy up iommu->gcmd handling
  intel-iommu: Fix tiny theoretical race in write-buffer flush.
  intel-iommu: Clean up handling of "caching mode" vs. IOTLB flushing.
  intel-iommu: Clean up handling of "caching mode" vs. context flushing.
  VT-d: fix invalid domain id for KVM context flush
  Fix !CONFIG_DMAR build failure introduced by Intel IOMMU Pass Through Support
  Intel IOMMU Pass Through Support

Fix up trivial conflicts in drivers/pci/{intel-iommu.c,intr_remapping.c}
2009-06-22 21:38:22 -07:00
Ingo Molnar
3d58f48ba0 Merge branch 'linus' into irq/numa
Conflicts:
	arch/mips/sibyte/bcm1480/irq.c
	arch/mips/sibyte/sb1250/irq.c

Merge reason: we gathered a few conflicts plus update to latest upstream fixes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-01 21:06:21 +02:00
Yu Zhao
93a23a7271 VT-d: support the device IOTLB
Enable the device IOTLB (i.e. ATS) for both the bare metal and KVM
environments.

Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-05-18 14:46:26 +01:00
Yu Zhao
9dd2fe8906 VT-d: cleanup iommu_flush_iotlb_psi and flush_unmaps
Make iommu_flush_iotlb_psi() and flush_unmaps() more readable.

Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-05-18 14:46:00 +01:00
David Woodhouse
fd18de50b9 intel-iommu: PAE memory corruption fix
PAGE_MASK is 0xFFFFF000 on i386 -- even with PAE.

So it's not sufficient to ensure that you use phys_addr_t or uint64_t
everywhere you handle physical addresses -- you also have to avoid using
the construct 'addr & PAGE_MASK', because that will strip the high 32
bits of the address.

This patch avoids that problem by using PHYSICAL_PAGE_MASK instead of
PAGE_MASK where appropriate. It leaves '& PAGE_MASK' in a few instances
that don't matter -- where it's being used on the virtual bus addresses
we're dishing out, which are 32-bit anyway.

Since PHYSICAL_PAGE_MASK is not present on other architectures, we have
to define it (to PAGE_MASK) if it's not already defined.

Maybe it would be better just to fix PAGE_MASK for i386/PAE?

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-05-11 07:51:01 -07:00
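
The truncation described in fd18de50b9 can be reproduced in a small user-space sketch. The macro and function names are hypothetical: `PAGE_MASK_32` models the 32-bit PAGE_MASK of i386 (0xFFFFF000), and `PHYSICAL_PAGE_MASK_64` models a mask as wide as the physical address. ANDing a 64-bit address with the narrow mask zero-extends the mask first, so the high 32 bits of the address are lost.

```c
#include <stdint.h>

#define PAGE_SHIFT 12

/* Models PAGE_MASK on i386: only 32 bits wide, i.e. 0xFFFFF000. */
#define PAGE_MASK_32          ((uint32_t)~((UINT32_C(1) << PAGE_SHIFT) - 1))

/* Models a mask that is as wide as a PAE physical address. */
#define PHYSICAL_PAGE_MASK_64 (~((UINT64_C(1) << PAGE_SHIFT) - 1))

/* The broken construct: the narrow mask strips bits 32 and above. */
static uint64_t mask_narrow(uint64_t addr)
{
    return addr & PAGE_MASK_32;
}

/* The fixed construct: the high bits of the address survive. */
static uint64_t mask_wide(uint64_t addr)
{
    return addr & PHYSICAL_PAGE_MASK_64;
}
```

For the physical address 0x123456789, the narrow mask gives 0x23456000 while the wide mask gives 0x123456000.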
David Woodhouse
c416daa98a intel-iommu: Tidy up iommu->gcmd handling
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-05-10 20:32:37 +01:00
David Woodhouse
462b60f6cc intel-iommu: Fix tiny theoretical race in write-buffer flush.
In iommu_flush_write_buffer() we read iommu->gcmd before taking the
register_lock, and then we mask in the WBF bit and write it to the
register.

There is a tiny chance that something else could have _changed_
iommu->gcmd before we take the lock, but after we read it. So we could
be undoing that change.

Never actually going to have happened in practice, since nothing else
changes that register at runtime -- aside from the write-buffer flush
it's only ever touched at startup for enabling translation, etc.

But worth fixing anyway.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-05-10 20:18:18 +01:00