linux

q3k/linux

Author	SHA1	Message	Date
Avi Kivity	73b1087e61	[PATCH] KVM: MMU: Report nx faults to the guest With the recent guest page fault change, we perform access checks on our own instead of relying on the cpu. This means we have to perform the nx checks as well. Software like the google toolbar on windows appears to rely on this somehow. Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-01-26 13:50:57 -08:00
Avi Kivity	7993ba43db	[PATCH] KVM: MMU: Perform access checks in walk_addr() Check pte permission bits in walk_addr(), instead of scattering the checks all over the code. This has the following benefits: 1. We no longer set the accessed bit for accessed which fail permission checks. 2. Setting the accessed bit is simplified. 3. Under some circumstances, we used to pretend a page fault was fixed when it would actually fail the access checks. This caused an unnecessary vmexit. 4. The error code for guest page faults is now correct. The fix helps netbsd further along booting, and allows kvm to pass the new mmu testsuite. Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-01-26 13:50:57 -08:00
Ingo Molnar	68a99f6d37	[PATCH] KVM: Simplify mmu_alloc_roots() Small optimization/cleanup: page == page_header(page->page_hpa) Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:28 -08:00
Ingo Molnar	7f7417d67e	[PATCH] KVM: Avoid oom on cr3 switch Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:28 -08:00
Avi Kivity	37a7d8b046	[PATCH] KVM: MMU: add audit code to check mappings, etc are correct Signed-off-by: Avi Kivity <avi@qumranet.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:27 -08:00
Avi Kivity	40907d5768	[PATCH] KVM: MMU: Flush guest tlb when reducing permissions on a pte If we reduce permissions on a pte, we must flush the cached copy of the pte from the guest's tlb. This is implemented at the moment by flushing the entire guest tlb, and can be improved by flushing just the relevant virtual address, if it is known. Signed-off-by: Avi Kivity <avi@qumranet.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:27 -08:00
Avi Kivity	e2dec939db	[PATCH] KVM: MMU: Detect oom conditions and propagate error to userspace Signed-off-by: Avi Kivity <avi@qumranet.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:27 -08:00
Avi Kivity	714b93da1a	[PATCH] KVM: MMU: Replace atomic allocations by preallocated objects The mmu sometimes needs memory for reverse mapping and parent pte chains. however, we can't allocate from within the mmu because of the atomic context. So, move the allocations to a central place that can be executed before the main mmu machinery, where we can bail out on failure before any damage is done. (error handling is deffered for now, but the basic structure is there) Signed-off-by: Avi Kivity <avi@qumranet.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:27 -08:00
Avi Kivity	f51234c2cd	[PATCH] KVM: MMU: Free pages on kvm destruction Because mmu pages have attached rmap and parent pte chain structures, we need to zap them before freeing so the attached structures are freed. Signed-off-by: Avi Kivity <avi@qumranet.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:27 -08:00
Avi Kivity	3bb65a22a4	[PATCH] KVM: MMU: Never free a shadow page actively serving as a root We always need cr3 to point to something valid, so if we detect that we're freeing a root page, simply push it back to the top of the active list. Signed-off-by: Avi Kivity <avi@qumranet.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:26 -08:00
Avi Kivity	86a5ba025d	[PATCH] KVM: MMU: Page table write flood protection In fork() (or when we protect a page that is no longer a page table), we can experience floods of writes to a page, which have to be emulated. This is expensive. So, if we detect such a flood, zap the page so subsequent writes can proceed natively. Signed-off-by: Avi Kivity <avi@qumranet.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:26 -08:00
Avi Kivity	139bdb2d9e	[PATCH] KVM: MMU: If an empty shadow page is not empty, report more info Signed-off-by: Avi Kivity <avi@qumranet.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:26 -08:00
Avi Kivity	5f1e0b6abc	[PATCH] KVM: MMU: Ensure freed shadow pages are clean Signed-off-by: Avi Kivity <avi@qumranet.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:26 -08:00
Avi Kivity	260746c03d	[PATCH] KVM: MMU: <ove is_empty_shadow_page() above kvm_mmu_free_page() Signed-off-by: Avi Kivity <avi@qumranet.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:26 -08:00
Avi Kivity	0e7bc4b961	[PATCH] KVM: MMU: Handle misaligned accesses to write protected guest page tables A misaligned access affects two shadow ptes instead of just one. Since a misaligned access is unlikely to occur on a real page table, just zap the page out of existence, avoiding further trouble. Signed-off-by: Avi Kivity <avi@qumranet.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:26 -08:00
Avi Kivity	73f7198e73	[PATCH] KVM: MMU: Remove release_pt_page_64() Unused. Signed-off-by: Avi Kivity <avi@qumranet.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:26 -08:00
Avi Kivity	5f015a5b28	[PATCH] KVM: MMU: Remove invlpg interception Since we write protect shadowed guest page tables, there is no need to trap page invalidations (the guest will always change the mapping before issuing the invlpg instruction). Signed-off-by: Avi Kivity <avi@qumranet.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:25 -08:00
Avi Kivity	ebeace8609	[PATCH] KVM: MMU: oom handling When beginning to process a page fault, make sure we have enough shadow pages available to service the fault. If not, free some pages. Signed-off-by: Avi Kivity <avi@qumranet.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:25 -08:00
Avi Kivity	cc4529efc7	[PATCH] KVM: MMU: kvm_mmu_put_page() only removes one link to the page ... and so must not free it unconditionally. Move the freeing to kvm_mmu_zap_page(). Signed-off-by: Avi Kivity <avi@qumranet.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:25 -08:00
Avi Kivity	697fe2e24a	[PATCH] KVM: MMU: Implement child shadow unlinking When removing a page table, we must maintain the parent_pte field all child shadow page tables. Signed-off-by: Avi Kivity <avi@qumranet.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:25 -08:00
Avi Kivity	a436036baf	[PATCH] KVM: MMU: If emulating an instruction fails, try unprotecting the page A page table may have been recycled into a regular page, and so any instruction can be executed on it. Unprotect the page and let the cpu do its thing. Signed-off-by: Avi Kivity <avi@qumranet.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:25 -08:00
Avi Kivity	9b7a032567	[PATCH] KVM: MMU: Zap shadow page table entries on writes to guest page tables Iterate over all shadow pages which correspond to a the given guest page table and remove the mappings. A subsequent page fault will reestablish the new mapping. Signed-off-by: Avi Kivity <avi@qumranet.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:25 -08:00
Avi Kivity	da4a00f002	[PATCH] KVM: MMU: Support emulated writes into RAM As the mmu write protects guest page table, we emulate those writes. Since they are not mmio, there is no need to go to userspace to perform them. So, perform the writes in the kernel if possible, and notify the mmu about them so it can take the approriate action. Signed-off-by: Avi Kivity <avi@qumranet.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:25 -08:00
Avi Kivity	815af8d42e	[PATCH] KVM: MMU: Let the walker extract the target page gfn from the pte This fixes a problem where set_pte_common() looked for shadowed pages based on the page directory gfn (a huge page) instead of the actual gfn being mapped. Signed-off-by: Avi Kivity <avi@qumranet.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:25 -08:00
Avi Kivity	374cbac033	[PATCH] KVM: MMU: Write protect guest pages when a shadow is created for them When we cache a guest page table into a shadow page table, we need to prevent further access to that page by the guest, as that would render the cache incoherent. Signed-off-by: Avi Kivity <avi@qumranet.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:25 -08:00
Avi Kivity	cea0f0e7ea	[PATCH] KVM: MMU: Shadow page table caching Define a hashtable for caching shadow page tables. Look up the cache on context switch (cr3 change) or during page faults. The key to the cache is a combination of - the guest page table frame number - the number of paging levels in the guest * we can cache real mode, 32-bit mode, pae, and long mode page tables simultaneously. this is useful for smp bootup. - the guest page table table * some kernels use a page as both a page table and a page directory. this allows multiple shadow pages to exist for that page, one per level - the "quadrant" * 32-bit mode page tables span 4MB, whereas a shadow page table spans 2MB. similarly, a 32-bit page directory spans 4GB, while a shadow page directory spans 1GB. the quadrant allows caching up to 4 shadow page tables for one guest page in one level. - a "metaphysical" bit * for real mode, and for pse pages, there is no guest page table, so set the bit to avoid write protecting the page. Signed-off-by: Avi Kivity <avi@qumranet.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:24 -08:00
Avi Kivity	25c0de2cc6	[PATCH] KVM: MMU: Make kvm_mmu_alloc_page() return a kvm_mmu_page pointer This allows further manipulation on the shadow page table. Signed-off-by: Avi Kivity <avi@qumranet.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:24 -08:00
Avi Kivity	17ac10ad2b	[PATCH] KVM: MU: Special treatment for shadow pae root pages Since we're not going to cache the pae-mode shadow root pages, allocate a single pae shadow that will hold the four lower-level pages, which will act as roots. Signed-off-by: Avi Kivity <avi@qumranet.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:24 -08:00
Avi Kivity	cd4a4e5374	[PATCH] KVM: MMU: Implement simple reverse mapping Keep in each host page frame's page->private a pointer to the shadow pte which maps it. If there are multiple shadow ptes mapping the page, set bit 0 of page->private, and use the rest as a pointer to a linked list of all such mappings. Reverse mappings are needed because we when we cache shadow page tables, we must protect the guest page tables from being modified by the guest, as that would invalidate the cached ptes. Signed-off-by: Avi Kivity <avi@qumranet.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:24 -08:00
Ingo Molnar	8018c27b26	[PATCH] kvm: fix GFP_KERNEL allocation in atomic section in kvm_dev_ioctl_create_vcpu() fix an GFP_KERNEL allocation in atomic section: kvm_dev_ioctl_create_vcpu() called kvm_mmu_init(), which calls alloc_pages(), while holding the vcpu. The fix is to set up the MMU state in two phases: kvm_mmu_create() and kvm_mmu_setup(). (NOTE: free_vcpus does an kvm_mmu_destroy() call so there's no need for any extra teardown branch on allocation/init failure here.) Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-30 10:56:44 -08:00
Avi Kivity	a9058ecd3c	[PATCH] KVM: Simplify is_long_mode() Instead of doing tricky stuff with the arch dependent virtualization registers, take a peek at the guest's efer. This simlifies some code, and fixes some confusion in the mmu branch. Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-30 10:56:44 -08:00
Avi Kivity	2c26495710	[PATCH] KVM: Use more traditional error handling in kvm_mmu_init() Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-22 08:55:46 -08:00
Avi Kivity	8c7bb723b4	[PATCH] KVM: MMU: Ignore pcd, pwt, and pat bits on ptes The pcd, pwt, and pat bits on page table entries affect the cpu cache. Since the cache is a host resource, the guest should not be able to control it. Moreover, the meaning of these bits changes depending on whether pat is enabled or not. So, force these bits to zero on shadow page table entries at all times. Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-13 09:05:47 -08:00
Avi Kivity	6aa8b732ca	[PATCH] kvm: userspace interface web site: http://kvm.sourceforge.net mailing list: kvm-devel@lists.sourceforge.net (http://lists.sourceforge.net/lists/listinfo/kvm-devel) The following patchset adds a driver for Intel's hardware virtualization extensions to the x86 architecture. The driver adds a character device (/dev/kvm) that exposes the virtualization capabilities to userspace. Using this driver, a process can run a virtual machine (a "guest") in a fully virtualized PC containing its own virtual hard disks, network adapters, and display. Using this driver, one can start multiple virtual machines on a host. Each virtual machine is a process on the host; a virtual cpu is a thread in that process. kill(1), nice(1), top(1) work as expected. In effect, the driver adds a third execution mode to the existing two: we now have kernel mode, user mode, and guest mode. Guest mode has its own address space mapping guest physical memory (which is accessible to user mode by mmap()ing /dev/kvm). Guest mode has no access to any I/O devices; any such access is intercepted and directed to user mode for emulation. The driver supports i386 and x86_64 hosts and guests. All combinations are allowed except x86_64 guest on i386 host. For i386 guests and hosts, both pae and non-pae paging modes are supported. SMP hosts and UP guests are supported. At the moment only Intel hardware is supported, but AMD virtualization support is being worked on. Performance currently is non-stellar due to the naive implementation of the mmu virtualization, which throws away most of the shadow page table entries every context switch. We plan to address this in two ways: - cache shadow page tables across tlb flushes - wait until AMD and Intel release processors with nested page tables Currently a virtual desktop is responsive but consumes a lot of CPU. Under Windows I tried playing pinball and watching a few flash movies; with a recent CPU one can hardly feel the virtualization. Linux/X is slower, probably due to X being in a separate process. In addition to the driver, you need a slightly modified qemu to provide I/O device emulation and the BIOS. Caveats (akpm: might no longer be true): - The Windows install currently bluescreens due to a problem with the virtual APIC. We are working on a fix. A temporary workaround is to use an existing image or install through qemu - Windows 64-bit does not work. That's also true for qemu, so it's probably a problem with the device model. [bero@arklinux.org: build fix] [simon.kagstrom@bth.se: build fix, other fixes] [uril@qumranet.com: KVM: Expose interrupt bitmap] [akpm@osdl.org: i386 build fix] [mingo@elte.hu: i386 fixes] [rdreier@cisco.com: add log levels to all printks] [randy.dunlap@oracle.com: Fix sparse NULL and C99 struct init warnings] [anthony@codemonkey.ws: KVM: AMD SVM: 32-bit host support] Signed-off-by: Yaniv Kamay <yaniv@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com> Cc: Simon Kagstrom <simon.kagstrom@bth.se> Cc: Bernhard Rosenkraenzer <bero@arklinux.org> Signed-off-by: Uri Lublin <uril@qumranet.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Roland Dreier <rolandd@cisco.com> Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Anthony Liguori <anthony@codemonkey.ws> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-10 09:57:22 -08:00

34 commits