Commit graph

35412 commits

Author SHA1 Message Date
Avi Kivity
069ebaa464 x86: Add cpu features MOVBE and POPCNT
Add cpu feature bit support for the MOVBE and POPCNT instructions.

Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:52 +03:00
Avi Kivity
7faa4ee1c7 KVM: Add AMD cpuid bit: cr8_legacy, abm, misaligned sse, sse4, 3dnow prefetch
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:52 +03:00
Avi Kivity
8d753f369b KVM: Fix cpuid feature misreporting
MTRR, PAT, MCE, and MCA are all supported (to some extent) but not reported.
Vista requires these features, so if userspace relies on kernel cpuid
reporting, it loses support for Vista.

Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:52 +03:00
Jan Kiszka
d6a8c875f3 KVM: Drop request_nmi from stats
The stats entry request_nmi is no longer used as the related user space
interface was dropped. So clean it up.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:51 +03:00
Gleb Natapov
fe8e7f83de KVM: SVM: Don't reinject event that caused a task switch
If a task switch caused by an event remove it from the event queue.
VMX already does that.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:51 +03:00
Andre Przywara
b586eb0253 KVM: SVM: Fix cross vendor migration issue in segment segment descriptor
On AMD CPUs sometimes the DB bit in the stack segment
descriptor is left as 1, although the whole segment has
been made unusable. Clear it here to pass an Intel VMX
entry check when cross vendor migrating.

Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:51 +03:00
Glauber Costa
9b5843ddd2 KVM: fix apic_debug instances
Apparently nobody turned this on in a while...
setting apic_debug to something compilable, generates
some errors. This patch fixes it.

Signed-off-by: Glauber Costa <glommer@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:50 +03:00
Sheng Yang
522c68c441 KVM: Enable snooping control for supported hardware
Memory aliases with different memory type is a problem for guest. For the guest
without assigned device, the memory type of guest memory would always been the
same as host(WB); but for the assigned device, some part of memory may be used
as DMA and then set to uncacheable memory type(UC/WC), which would be a conflict of
host memory type then be a potential issue.

Snooping control can guarantee the cache correctness of memory go through the
DMA engine of VT-d.

[avi: fix build on ia64]

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:50 +03:00
Sheng Yang
4b12f0de33 KVM: Replace get_mt_mask_shift with get_mt_mask
Shadow_mt_mask is out of date, now it have only been used as a flag to indicate
if TDP enabled. Get rid of it and use tdp_enabled instead.

Also put memory type logical in kvm_x86_ops->get_mt_mask().

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:49 +03:00
Jan Blunck
9b62e5b10f KVM: Wake up waitqueue before calling get_cpu()
This moves the get_cpu() call down to be called after we wake up the
waiters. Therefore the waitqueue locks can safely be rt mutex.

Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Sven-Thorsten Dietrich <sven@thebigcorporation.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:49 +03:00
Gleb Natapov
14d0bc1f7c KVM: Get rid of get_irq() callback
It just returns pending IRQ vector from the queue for VMX/SVM.
Get IRQ directly from the queue before migration and put it back
after.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:49 +03:00
Gleb Natapov
16d7a19117 KVM: Fix userspace IRQ chip migration
Re-put pending IRQ vector into interrupt_bitmap before migration.
Otherwise it will be lost if migration happens in the wrong time.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:48 +03:00
Gleb Natapov
95ba827313 KVM: SVM: Add NMI injection support
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:48 +03:00
Gleb Natapov
c4282df98a KVM: Get rid of arch.interrupt_window_open & arch.nmi_window_open
They are recalculated before each use anyway.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:48 +03:00
Gleb Natapov
0a5fff1923 KVM: Do not report TPR write to userspace if new value bigger or equal to a previous one.
Saves many exits to userspace in a case of IRQ chip in userspace.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:47 +03:00
Gleb Natapov
615d519305 KVM: sync_lapic_to_cr8() should always sync cr8 to V_TPR
Even if IRQ chip is in userspace.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:47 +03:00
Gleb Natapov
115666dfc7 KVM: Remove kvm_push_irq()
No longer used.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:47 +03:00
Gleb Natapov
1d6ed0cb95 KVM: Remove inject_pending_vectors() callback
It is the same as inject_pending_irq() for VMX/SVM now.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:47 +03:00
Gleb Natapov
1cb948ae86 KVM: Remove exception_injected() callback.
It always return false for VMX/SVM now.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:46 +03:00
Gleb Natapov
9222be18f7 KVM: SVM: Coalesce userspace/kernel irqchip interrupt injection logic
Start to use interrupt/exception queues like VMX does.
This also fix the bug that if exit was caused by a guest
internal exception access to IDT the exception was not
reinjected.

Use EVENTINJ to inject interrupts.  Use VINT only for detecting when IRQ
windows is open again.  EVENTINJ ensures
the interrupt is injected immediately and not delayed.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:46 +03:00
Gleb Natapov
5df5664647 KVM: Use kvm_arch_interrupt_allowed() instead of checking interrupt_window_open directly
kvm_arch_interrupt_allowed() also checks IF so drop the check.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:46 +03:00
Gleb Natapov
1f21e79aac KVM: VMX: Cleanup vmx_intr_assist()
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:45 +03:00
Gleb Natapov
863e8e658e KVM: VMX: Consolidate userspace and kernel interrupt injection for VMX
Use the same callback to inject irq/nmi events no matter what irqchip is
in use. Only from VMX for now.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:45 +03:00
Gleb Natapov
8061823a25 KVM: Make kvm_cpu_(has|get)_interrupt() work for userspace irqchip too
At the vector level, kernel and userspace irqchip are fairly similar.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:45 +03:00
Jan Kiszka
3438253926 KVM: MMU: Fix auditing code
Fix build breakage of hpa lookup in audit_mappings_page. Moreover, make
this function robust against shadow_notrap_nonpresent_pte entries.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:45 +03:00
Jes Sorensen
43890ae8bc KVM: ia64: ia64 vcpu_reset() do not call kmalloc() with irqs disabled
Restore local irq enabled state before calling kvm_arch_vcpu_init(),
which calls kmalloc(GFP_KERNEL).

Signed-off-by: Jes Sorensen <jes@sgi.com>
Acked-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:44 +03:00
Jes Sorensen
4d13c3b04f KVM: ia64: preserve int status through call to kvm_insert_vmm_mapping
Preserve interrupt status around call to kvm_insert_vmm_mappin()
in kvm_vcpu_pre_transition().

Signed-off-by: Jes Sorensen <jes@sgi.com>
Acked-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:44 +03:00
Jes Sorensen
457459c3c7 KVM: ia64: restore irq state before calling kvm_vcpu_init
Make sure to restore the psr after calling kvm_insert_vmm_mapping()
which calls ia64_itr_entry() as it disables local interrupts and
kvm_vcpu_init() may sleep.

Avoids a warning from the lock debugging code.

Signed-off-by: Jes Sorensen <jes@sgi.com>
Acked-by : Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:44 +03:00
Jes Sorensen
f9b647adda KVM: ia64: remove empty function vti_vcpu_load()
vti_vcpu_load() doesn't do anything, so lets get rid of it.

Signed-off-by: Jes Sorensen <jes@sgi.com>
Acked-by : Xiantao Zhang<xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:44 +03:00
Xiantao Zhang
64f6afbd4c KVM: ia64: Flush all TLBs once guest's memory mapping changes.
Flush all vcpu's TLB entries once changes guest's memory mapping.

Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:43 +03:00
Marcelo Tosatti
59839dfff5 KVM: x86: check for cr3 validity in ioctl_set_sregs
Matt T. Yourst notes that kvm_arch_vcpu_ioctl_set_sregs lacks validity
checking for the new cr3 value:

"Userspace callers of KVM_SET_SREGS can pass a bogus value of cr3 to
the kernel. This will trigger a NULL pointer access in gfn_to_rmap()
when userspace next tries to call KVM_RUN on the affected VCPU and kvm
attempts to activate the new non-existent page table root.

This happens since kvm only validates that cr3 points to a valid guest
physical memory page when code *inside* the guest sets cr3. However, kvm
currently trusts the userspace caller (e.g. QEMU) on the host machine to
always supply a valid page table root, rather than properly validating
it along with the rest of the reloaded guest state."

http://sourceforge.net/tracker/?func=detail&atid=893831&aid=2687641&group_id=180599

Check for a valid cr3 address in kvm_arch_vcpu_ioctl_set_sregs, triple
fault in case of failure.

Cc: stable@kernel.org
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:43 +03:00
Jes Sorensen
c6b60c6921 KVM: ia64: Don't hold slots_lock in guest mode
Reorder locking to avoid holding the slots_lock when entering
the guest.

Signed-off-by: Jes Sorensen <jes@sgi.com>
Acked-by : Xiantao Zhang<xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:43 +03:00
Avi Kivity
463656c000 KVM: Replace kvmclock open-coded get_cpu_var() with the real thing
Suggested by Ingo Molnar.

Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:42 +03:00
Gleb Natapov
8317c298ea KVM: SVM: Skip instruction on a task switch only when appropriate
If a task switch was initiated because off a task gate in IDT and IDT
was accessed because of an external even the instruction should not
be skipped.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:42 +03:00
Gleb Natapov
ba8afb6b0a KVM: x86 emulator: Add new mode of instruction emulation: skip
In the new mode instruction is decoded, but not executed. The EIP
is moved to point after the instruction.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:42 +03:00
Gleb Natapov
e637b8238a KVM: x86 emulator: Decode soft interrupt instructions
Do not emulate them yet.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:41 +03:00
Gleb Natapov
84ce66a686 KVM: x86 emulator: Completely decode in/out at decoding stage
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:41 +03:00
Gleb Natapov
341de7e372 KVM: x86 emulator: Add unsigned byte immediate decode
Extend "Source operand type" opcode description field to 4 bites
to accommodate new option.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:41 +03:00
Gleb Natapov
d53c4777b3 KVM: x86 emulator: Complete decoding of call near in decode stage
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:41 +03:00
Gleb Natapov
b2833e3cde KVM: x86 emulator: Complete short/near jcc decoding in decode stage
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:40 +03:00
Gleb Natapov
782b877c80 KVM: x86 emulator: Complete ljmp decoding at decode stage
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:40 +03:00
Gleb Natapov
0654169e73 KVM: x86 emulator: Add lcall decoding
No emulation yet.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:40 +03:00
Gleb Natapov
a5f868bd45 KVM: x86 emulator: Add decoding of 16bit second immediate argument
Such as segment number in lcall/ljmp

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:40 +03:00
Marcelo Tosatti
c2d0ee46e6 KVM: MMU: remove global page optimization logic
Complexity to fix it not worthwhile the gains, as discussed
in http://article.gmane.org/gmane.comp.emulators.kvm.devel/28649.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:39 +03:00
Marcelo Tosatti
ede2ccc517 KVM: PIT: fix count read and mode 0 handling
Commit 46ee278652f4cbd51013471b64c7897ba9bcd1b1 causes Solaris 10
to hang on boot.

Assuming that PIT counter reads should return 0 for an expired timer
is wrong: when it is active, the counter never stops (see comment on
__kpit_elapsed).

Also arm a one shot timer for mode 0.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:39 +03:00
Zhang, Xiantao
2906e79f21 KVM: ia64: make kvm depend on CONFIG_MODULES.
Since kvm-intel modue can't be built-in, make kvm depend on
CONFIG_MODULES.

Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:38 +03:00
Gleb Natapov
2d03319654 KVM: x86 emulator: fix call near emulation
The length of pushed on to the stack return address depends on operand
size not address size.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:38 +03:00
Sheng Yang
4c26b4cd6f KVM: MMU: Discard reserved bits checking on PDE bit 7-8
1. It's related to a Linux kernel bug which fixed by Ingo on
07a66d7c53. The original code exists for quite a
long time, and it would convert a PDE for large page into a normal PDE. But it
fail to fit normal PDE well.  With the code before Ingo's fix, the kernel would
fall reserved bit checking with bit 8 - the remaining global bit of PTE. So the
kernel would receive a double-fault.

2. After discussion, we decide to discard PDE bit 7-8 reserved checking for now.
For this marked as reserved in SDM, but didn't checked by the processor in
fact...

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:38 +03:00
Gleb Natapov
64a7ec0668 KVM: Fix unneeded instruction skipping during task switching.
There is no need to skip instruction if the reason for a task switch
is a task gate in IDT and access to it is caused by an external even.
The problem  is currently solved only for VMX since there is no reliable
way to skip an instruction in SVM. We should emulate it instead.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:38 +03:00
Gleb Natapov
b237ac37a1 KVM: Fix task switch back link handling.
Back link is written to a wrong TSS now.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:37 +03:00
Gleb Natapov
8843419048 KVM: VMX: Do not zero idt_vectoring_info in vmx_complete_interrupts().
We will need it later in task_switch().
Code in handle_exception() is dead. is_external_interrupt(vect_info)
will always be false since idt_vectoring_info is zeroed in
vmx_complete_interrupts().

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:37 +03:00
Gleb Natapov
37b96e9880 KVM: VMX: Rewrite vmx_complete_interrupt()'s twisted maze of if() statements
...with a more straightforward switch().

Also fix a bug when NMI could be dropped on exit. Although this should
never happen in practice, since NMIs can only be injected, never triggered
internally by the guest like exceptions.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:37 +03:00
Gleb Natapov
7b4a25cb29 KVM: VMX: Fix handling of a fault during NMI unblocked due to IRET
Bit 12 is undefined in any of the following cases:
 If the VM exit sets the valid bit in the IDT-vectoring information field.
 If the VM exit is due to a double fault.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:36 +03:00
Dong, Eddie
20c466b561 KVM: Use rsvd_bits_mask in load_pdptrs()
Also remove bit 5-6 from rsvd_bits_mask per latest SDM.

Signed-off-by: Eddie Dong <Eddie.Dong@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:36 +03:00
Sheng Yang
93ba03c2e2 KVM: VMX: Fix feature testing
The testing of feature is too early now, before vmcs_config complete initialization.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:36 +03:00
Sheng Yang
045471563d KVM: VMX: Clean up Flex Priority related
And clean paranthes on returns.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:36 +03:00
Wei Yongjun
7a6ce84c74 KVM: remove pointless conditional before kfree() in lapic initialization
Remove pointless conditional before kfree().

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:35 +03:00
Avi Kivity
9645bb56b3 KVM: MMU: Use different shadows when EFER.NXE changes
A pte that is shadowed when the guest EFER.NXE=1 is not valid when
EFER.NXE=0; if bit 63 is set, the pte should cause a fault, and since the
shadow EFER always has NX enabled, this won't happen.

Fix by using a different shadow page table for different EFER.NXE bits.  This
allows vcpus to run correctly with different values of EFER.NXE, and for
transitions on this bit to be handled correctly without requiring a full
flush.

Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:35 +03:00
Dong, Eddie
82725b20e2 KVM: MMU: Emulate #PF error code of reserved bits violation
Detect, indicate, and propagate page faults where reserved bits are set.
Take care to handle the different paging modes, each of which has different
sets of reserved bits.

[avi: fix pte reserved bits for efer.nxe=0]

Signed-off-by: Eddie Dong <eddie.dong@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:35 +03:00
Yang Zhang
362c1055e5 KVM: ia64: enable external interrupt in vmm
Currently, the interrupt enable bit is cleared when in
the vmm.  This patch sets the bit and the external interrupts can
be dealt with when in the vmm.  This improves the I/O performance.

Signed-off-by: Yang Zhang <yang.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:34 +03:00
Eddie Dong
a8b876b1a4 KVM: MMU: Fix comment in page_fault()
The original one is for the code before refactoring.

Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com>
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:34 +03:00
Sheng Yang
f9c617f611 KVM: VMX: Correct wrong vmcs field sizes
EXIT_QUALIFICATION and GUEST_LINEAR_ADDRESS are natural width, not 64-bit.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:34 +03:00
Avi Kivity
7d433b9f94 KVM: VMX: Make flexpriority module parameter reflect hardware capability
If the hardware does not support flexpriority, zero the module parameter.

Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:34 +03:00
Gleb Natapov
78646121e9 KVM: Fix interrupt unhalting a vcpu when it shouldn't
kvm_vcpu_block() unhalts vpu on an interrupt/timer without checking
if interrupt window is actually opened.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:33 +03:00
Gleb Natapov
09cec75488 KVM: Timer event should not unconditionally unhalt vcpu.
Currently timer events are processed before entering guest mode. Move it
to main vcpu event loop since timer events should be processed even while
vcpu is halted.  Timer may cause interrupt/nmi to be injected and only then
vcpu will be unhalted.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:33 +03:00
Avi Kivity
089d034e0c KVM: VMX: Fold vm_need_ept() into callers
Trivial.

Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:32 +03:00
Avi Kivity
575ff2dcb2 KVM: VMX: Zero ept module parameter if ept is not present
Allows reading back hardware capability.

Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:32 +03:00
Avi Kivity
919818abc2 KVM: VMX: Zero the vpid module parameter if vpid is not supported
This allows reading back how the hardware is configured.

Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:32 +03:00
Avi Kivity
4462d21a61 KVM: VMX: Annotate module parameters as __read_mostly
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:32 +03:00
Avi Kivity
736caefe15 KVM: VMX: Simplify module parameter names
Instead of 'enable_vpid=1', use a simple 'vpid=1'.

Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:31 +03:00
Avi Kivity
6062d012ed KVM: VMX: Rename kvm_handle_exit() to vmx_handle_exit()
It is a static vmx-specific function.

Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:31 +03:00
Avi Kivity
c1f8bc04c6 KVM: VMX: Make module parameters readable
Useful to see how the module was loaded.

Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:31 +03:00
Gleb Natapov
fe4c7b1914 KVM: reuse (pop|push)_irq from svm.c in vmx.c
The prioritized bit vector manipulation functions are useful in both vmx and
svm.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:31 +03:00
Gleb Natapov
61c50edfcd KVM: SVM: Remove duplicate code in svm_do_inject_vector()
svm_do_inject_vector() reimplements pop_irq().

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:30 +03:00
Amit Shah
7fe29e0faa KVM: x86: Ignore reads to EVNTSEL MSRs
We ignore writes to the performance counters and performance event
selector registers already. Kaspersky antivirus reads the eventsel
MSR causing it to crash with the current behaviour.

Return 0 as data when the eventsel registers are read to stop the
crash.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:30 +03:00
Gleb Natapov
f00be0cae4 KVM: MMU: do not free active mmu pages in free_mmu_pages()
free_mmu_pages() should only undo what alloc_mmu_pages() does.
Free mmu pages from the generic VM destruction function, kvm_destroy_vm().

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:30 +03:00
Sheng Yang
e56d532f20 KVM: Device assignment framework rework
After discussion with Marcelo, we decided to rework device assignment framework
together. The old problems are kernel logic is unnecessary complex. So Marcelo
suggest to split it into a more elegant way:

1. Split host IRQ assign and guest IRQ assign. And userspace determine the
combination. Also discard msi2intx parameter, userspace can specific
KVM_DEV_IRQ_HOST_MSI | KVM_DEV_IRQ_GUEST_INTX in assigned_irq->flags to
enable MSI to INTx convertion.

2. Split assign IRQ and deassign IRQ. Import two new ioctls:
KVM_ASSIGN_DEV_IRQ and KVM_DEASSIGN_DEV_IRQ.

This patch also fixed the reversed _IOR vs _IOW in definition(by deprecated the
old interface).

[avi: replace homemade bitcount() by hweight_long()]

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:29 +03:00
Hannes Eder
386eb6e8b3 KVM: make 'lapic_timer_ops' and 'kpit_ops' static
Fix this sparse warnings:
  arch/x86/kvm/lapic.c:916:22: warning: symbol 'lapic_timer_ops' was not declared. Should it be static?
  arch/x86/kvm/i8254.c:268:22: warning: symbol 'kpit_ops' was not declared. Should it be static?

Signed-off-by: Hannes Eder <hannes@hanneseder.net>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:29 +03:00
Jes Sorensen
0b5d7a2ccb KVM: ia64: Drop in SN2 replacement of fast path ITC emulation fault handler
Copy in SN2 RTC based ITC emulation for fast exit. The two versions
have the same size, so a dropin is simpler than patching the branch
instruction to hit the SN2 version.

Signed-off-by: Jes Sorensen <jes@sgi.com>
Acked-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:29 +03:00
Jes Sorensen
ce17c64373 KVM: ia64: SN2 adjust emulated ITC frequency to match RTC frequency
On SN2 do not pass down the real ITC frequency, but rather patch the
values to match the SN2 RTC frequency.

Signed-off-by: Jes Sorensen <jes@sgi.com>
Acked-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:29 +03:00
Jes Sorensen
c6c9fcdf0f KVM: ia64: Create inline function kvm_get_itc() to centralize ITC reading.
Move all reading of special register 'AR_ITC' into two functions, one
in the kernel and one in the VMM module. When running on SN2, base the
result on the RTC rather the system ITC, as the ITC isn't
synchronized.

Signed-off-by: Jes Sorensen <jes@sgi.com>
Acked-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:28 +03:00
Jes Sorensen
0c72ea7fb8 KVM: ia64: Map in SN2 RTC registers to the VMM module
On SN2, map in the SN2 RTC registers to the VMM module, needed for ITC
emulation.

Signed-off-by: Jes Sorensen <jes@sgi.com>
Acked-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:28 +03:00
Gleb Natapov
58c2dde17d KVM: APIC: get rid of deliver_bitmask
Deliver interrupt during destination matching loop.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Acked-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-06-10 11:48:27 +03:00
Gleb Natapov
e1035715ef KVM: change the way how lowest priority vcpu is calculated
The new way does not require additional loop over vcpus to calculate
the one with lowest priority as one is chosen during delivery bitmap
construction.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-06-10 11:48:27 +03:00
Gleb Natapov
343f94fe4d KVM: consolidate ioapic/ipi interrupt delivery logic
Use kvm_apic_match_dest() in kvm_get_intr_delivery_bitmask() instead
of duplicating the same code. Use kvm_get_intr_delivery_bitmask() in
apic_send_ipi() to figure out ipi destination instead of reimplementing
the logic.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-06-10 11:48:27 +03:00
Gleb Natapov
6da7e3f643 KVM: APIC: kvm_apic_set_irq deliver all kinds of interrupts
Get rid of ioapic_inj_irq() and ioapic_inj_nmi() functions.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-06-10 11:48:26 +03:00
Joerg Roedel
f5a1e9f895 KVM: MMU: remove call to kvm_mmu_pte_write from walk_addr
There is no reason to update the shadow pte here because the guest pte
is only changed to dirty state.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-06-10 11:48:26 +03:00
Yang Zhang
3f5e06f879 KVM: ia64: fix compilation error in kvm_get_lowest_prio_vcpu
Modify the arg of kvm_get_lowest_prio_vcpu().
Make it consistent with its declaration.

Signed-off-by: Yang Zhang <yang.zhang@intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-06-10 11:48:25 +03:00
Marcelo Tosatti
d3c7b77d1a KVM: unify part of generic timer handling
Hide the internals of vcpu awakening / injection from the in-kernel
emulated timers. This makes future changes in this logic easier and
decreases the distance to more generic timer handling.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:25 +03:00
Marcelo Tosatti
fd66842370 KVM: PIT: remove usage of count_load_time for channel 0
We can infer elapsed time from hrtimer_expires_remaining.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:25 +03:00
Marcelo Tosatti
5a05d54554 KVM: PIT: remove unused scheduled variable
Unused.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:25 +03:00
Marcelo Tosatti
a90ede7b17 KVM: x86: paravirt skip pit-through-ioapic boot check
Skip the test which checks if the PIT is properly routed when
using the IOAPIC, aimed at buggy hardware.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:24 +03:00
Matt T. Yourst
2dea4c84bc KVM: x86: silence preempt warning on kvm_write_guest_time
This issue just appeared in kvm-84 when running on 2.6.28.7 (x86-64)
with PREEMPT enabled.

We're getting syslog warnings like this many (but not all) times qemu
tells KVM to run the VCPU:

BUG: using smp_processor_id() in preemptible [00000000] code:
qemu-system-x86/28938
caller is kvm_arch_vcpu_ioctl_run+0x5d1/0xc70 [kvm]
Pid: 28938, comm: qemu-system-x86 2.6.28.7-mtyrel-64bit
Call Trace:
debug_smp_processor_id+0xf7/0x100
kvm_arch_vcpu_ioctl_run+0x5d1/0xc70 [kvm]
? __wake_up+0x4e/0x70
? wake_futex+0x27/0x40
kvm_vcpu_ioctl+0x2e9/0x5a0 [kvm]
enqueue_hrtimer+0x8a/0x110
_spin_unlock_irqrestore+0x27/0x50
vfs_ioctl+0x31/0xa0
do_vfs_ioctl+0x74/0x480
sys_futex+0xb4/0x140
sys_ioctl+0x99/0xa0
system_call_fastpath+0x16/0x1b

As it turns out, the call trace is messed up due to gcc's inlining, but
I isolated the problem anyway: kvm_write_guest_time() is being used in a
non-thread-safe manner on preemptable kernels.

Basically kvm_write_guest_time()'s body needs to be surrounded by
preempt_disable() and preempt_enable(), since the kernel won't let us
query any per-CPU data (indirectly using smp_processor_id()) without
preemption disabled. The attached patch fixes this issue by disabling
preemption inside kvm_write_guest_time().

[marcelo: surround only __get_cpu_var calls since the warning
is harmless]

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:24 +03:00
Sheng Yang
d510d6cc65 KVM: Enable MSI-X for KVM assigned device
This patch finally enable MSI-X.

What we need for MSI-X:
1. Intercept one page in MMIO region of device. So that we can get guest desired
MSI-X table and set up the real one. Now this have been done by guest, and
transfer to kernel using ioctl KVM_SET_MSIX_NR and KVM_SET_MSIX_ENTRY.

2. Information for incoming interrupt. Now one device can have more than one
interrupt, and they are all handled by one workqueue structure. So we need to
identify them. The previous patch enable gsi_msg_pending_bitmap get this done.

3. Mapping from host IRQ to guest gsi as well as guest gsi to real MSI/MSI-X
message address/data. We used same entry number for the host and guest here, so
that it's easy to find the correlated guest gsi.

What we lack for now:
1. The PCI spec said nothing can existed with MSI-X table in the same page of
MMIO region, except pending bits. The patch ignore pending bits as the first
step (so they are always 0 - no pending).

2. The PCI spec allowed to change MSI-X table dynamically. That means, the OS
can enable MSI-X, then mask one MSI-X entry, modify it, and unmask it. The patch
didn't support this, and Linux also don't work in this way.

3. The patch didn't implement MSI-X mask all and mask single entry. I would
implement the former in driver/pci/msi.c later. And for single entry, userspace
should have reposibility to handle it.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:23 +03:00
Sheng Yang
bfd349d073 KVM: bit ops for deliver_bitmap
It's also convenient when we extend KVM supported vcpu number in the future.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:22 +03:00
Sheng Yang
110c2faeba KVM: Update intr delivery func to accept unsigned long* bitmap
Would be used with bit ops, and would be easily extended if KVM_MAX_VCPUS is
increased.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:22 +03:00
Avi Kivity
5897297bc2 KVM: VMX: Don't intercept MSR_KERNEL_GS_BASE
Windows 2008 accesses this MSR often on context switch intensive workloads;
since we run in guest context with the guest MSR value loaded (so swapgs can
work correctly), we can simply disable interception of rdmsr/wrmsr for this
MSR.

A complication occurs since in legacy mode, we run with the host MSR value
loaded. In this case we enable interception.  This means we need two MSR
bitmaps, one for legacy mode and one for long mode.

Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:21 +03:00
Avi Kivity
3e7c73e9b1 KVM: VMX: Don't use highmem pages for the msr and pio bitmaps
Highmem pages are a pain, and saving three lowmem pages on i386 isn't worth
the extra code.

Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10 11:48:21 +03:00
Paul Mundt
c1d0d32a60 sh: plug vsyscall dir in to archclean.
The vsyscall targets are presently not cleaned up, so just handle it in
the archclean rule.

Reported-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-06-10 09:48:33 +03:00
Chuck Ebbert
0b8c3d5ab0 x86: Clear TS in irq_ts_save() when in an atomic section
The dynamic FPU context allocation changes caused the padlock driver
to generate the below warning. Fix it by masking TS when doing padlock
encryption operations in an atomic section.

This solves:

BUG: sleeping function called from invalid context at mm/slub.c:1602
in_atomic(): 1, irqs_disabled(): 0, pid: 82, name: cryptomgr_test
Pid: 82, comm: cryptomgr_test Not tainted 2.6.29.4-168.test7.fc11.x86_64 #1
Call Trace:
[<ffffffff8103ff16>] __might_sleep+0x10b/0x110
[<ffffffff810cd3b2>] kmem_cache_alloc+0x37/0xf1
[<ffffffff81018505>] init_fpu+0x49/0x8a
[<ffffffff81012a83>] math_state_restore+0x3e/0xbc
[<ffffffff813ac6d0>] do_device_not_available+0x9/0xb
[<ffffffff810123ab>] device_not_available+0x1b/0x20
[<ffffffffa001c066>] ? aes_crypt+0x66/0x74 [padlock_aes]
[<ffffffff8119a51a>] ? blkcipher_walk_next+0x257/0x2e0
[<ffffffff8119a731>] ? blkcipher_walk_first+0x18e/0x19d
[<ffffffffa001c1fe>] aes_encrypt+0x9d/0xe5 [padlock_aes]
[<ffffffffa0027253>] crypt+0x6b/0x114 [xts]
[<ffffffffa001c161>] ? aes_encrypt+0x0/0xe5 [padlock_aes]
[<ffffffffa001c161>] ? aes_encrypt+0x0/0xe5 [padlock_aes]
[<ffffffffa0027390>] encrypt+0x49/0x4b [xts]
[<ffffffff81199acc>] async_encrypt+0x3c/0x3e
[<ffffffff8119dafc>] test_skcipher+0x1da/0x658
[<ffffffff811979c3>] ? crypto_spawn_tfm+0x8e/0xb1
[<ffffffff8119672d>] ? __crypto_alloc_tfm+0x11b/0x15f
[<ffffffff811979c3>] ? crypto_spawn_tfm+0x8e/0xb1
[<ffffffff81199dbe>] ? skcipher_geniv_init+0x2b/0x47
[<ffffffff8119a905>] ? async_chainiv_init+0x5c/0x61
[<ffffffff8119dfdd>] alg_test_skcipher+0x63/0x9b
[<ffffffff8119e1bc>] alg_test+0x12d/0x175
[<ffffffff8119c488>] cryptomgr_test+0x38/0x54
[<ffffffff8119c450>] ? cryptomgr_test+0x0/0x54
[<ffffffff8105c6c9>] kthread+0x4d/0x78
[<ffffffff8101264a>] child_rip+0xa/0x20
[<ffffffff81011f67>] ? restore_args+0x0/0x30
[<ffffffff8105c67c>] ? kthread+0x0/0x78
[<ffffffff81012640>] ? child_rip+0x0/0x20

Signed-off-by: Chuck Ebbert <cebbert@redhat.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <20090609104050.50158cfe@dhcp-100-2-144.bos.redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-09 16:50:43 +02:00
Yong Wang
fecc8ac849 perf_counter, x86: Correct some event and umask values for Intel processors
Correct some event and UMASK values according to Intel SDM,
in the Nehalem and Atom tables.

Signed-off-by: Yong Wang <yong.y.wang@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <20090609131553.GA12489@ywang-moblin2.bj.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-09 16:50:07 +02:00
Andreas Herrmann
42937e81a8 x86: Detect use of extended APIC ID for AMD CPUs
Booting a 32-bit kernel on Magny-Cours results in the following panic:

  ...
  Using APIC driver default
  ...
  Overriding APIC driver with bigsmp
  ...
  Getting VERSION: 80050010
  Getting VERSION: 80050010
  Getting ID: 10000000
  Getting ID: ef000000
  Getting LVT0: 700
  Getting LVT1: 10000
  Kernel panic - not syncing: Boot APIC ID in local APIC unexpected (16 vs 0)
  Pid: 1, comm: swapper Not tainted 2.6.30-rcX #2
  Call Trace:
   [<c05194da>] ? panic+0x38/0xd3
   [<c0743102>] ? native_smp_prepare_cpus+0x259/0x31f
   [<c073b19d>] ? kernel_init+0x3e/0x141
   [<c073b15f>] ? kernel_init+0x0/0x141
   [<c020325f>] ? kernel_thread_helper+0x7/0x10

The reason is that default_get_apic_id handled extension of local APIC
ID field just in case of XAPIC.

Thus for this AMD CPU, default_get_apic_id() returns 0 and
bigsmp_get_apic_id() returns 16 which leads to the respective kernel
panic.

This patch introduces a Linux specific feature flag to indicate
support for extended APIC id (8 bits instead of 4 bits width) and sets
the flag on AMD CPUs if applicable.

Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: <stable@kernel.org>
LKML-Reference: <20090608135509.GA12431@alberich.amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-09 15:28:46 +02:00
Yinghai Lu
eaa958402e cpumask: alloc zeroed cpumask for static cpumask_var_ts
These are defined as static cpumask_var_t so if MAXSMP is not used,
they are cleared already.  Avoid surprises when MAXSMP is enabled.

Signed-off-by: Yinghai Lu <yinghai.lu@kernel.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-06-09 22:30:27 +09:30
Joerg Roedel
e9a22a13c7 amd-iommu: remove unnecessary "AMD IOMMU: " prefix
That prefix is already included in the DUMP_printk macro. So there is no
need to repeat it in the format string.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-06-09 12:01:58 +02:00
Joerg Roedel
71ff3bca2f amd-iommu: detach device explicitly before attaching it to a new domain
This fixes a bug with a device that could not be assigned to a KVM guest
because it is still assigned to a dma_ops protection domain.

[chrisw: simply remove WARN_ON(), will always fire since dev->driver
will be pci-sub]

Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-06-09 11:14:14 +02:00
Joerg Roedel
29150078d7 amd-iommu: remove BUS_NOTIFY_BOUND_DRIVER handling
Handling this event causes device assignment in KVM to fail because the
device gets re-attached as soon as the pci-stub registers as the driver
for the device.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-06-09 10:54:18 +02:00
Joerg Roedel
d2dd01de99 Merge commit 'tip/core/iommu' into amd-iommu/fixes 2009-06-09 10:50:57 +02:00
Thomas Gleixner
820a644211 perf_counter, x86: Clean up hw_cache_event ids copies
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-08 23:10:42 +02:00
Thomas Gleixner
f86748e91a perf_counter, x86: Implement generalized cache event types, add AMD support
Fill in amd_hw_cache_event_id[] with the AMD CPU specific events,
for family 0x0f, 0x10 and 0x11.

There's apparently no distinction between load and store events, so
we only fill in the load events.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-08 23:10:37 +02:00
Andreas Herrmann
c9690998ef x86: memtest: remove 64-bit division
Using gcc 3.3.5 a "make allmodconfig" + "CONFIG_KVM=n"
triggers a build error:

 arch/x86/mm/built-in.o(.init.text+0x43f7): In function `__change_page_attr':
 arch/x86/mm/pageattr.c:114: undefined reference to `__udivdi3'
 make: *** [.tmp_vmlinux1] Error 1

The culprit turned out to be a division in arch/x86/mm/memtest.c
For more info see this thread:

  http://marc.info/?l=linux-kernel&m=124416232620683

The patch entirely removes the division that caused the build
error.

[ Impact: build fix with certain GCC versions ]

Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: xiyou.wangcong@gmail.com
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: <stable@kernel.org>
LKML-Reference: <20090608170939.GB12431@alberich.amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-08 19:18:25 +02:00
Jack Steiner
c4ed3f04ba x86, UV: Fix macros for multiple coherency domains
Fix bug in the SGI UV macros that support systems with multiple
coherency domains.  The macros used for referencing global MMR
(chipset registers) are failing to correctly "or" the NASID
(node identifier) bits that reside above M+N. These high bits
are supplied automatically by the chipset for memory accesses
coming from the processor socket.

However, the bits must be present for references to the special
global MMR space used to map chipset registers. (See uv_hub.h
for more details ...)

The bug results in references to invalid/incorrect nodes.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Cc: <stable@kernel.org>
LKML-Reference: <20090608154405.GA16395@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-08 18:57:47 +02:00
Linus Torvalds
46056be71c Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
  MIPS: Outline udelay and fix a few issues.
  MIPS: ioctl.h: Fix headers_check warnings
  MIPS: Cobalt: PCI bus is always required to obtain the board ID
  MIPS: Kconfig: Remove "Support for" from Cavium system type
  MIPS: Sibyte: Honor CONFIG_CMDLINE
  SSB: BCM47xx: Export ssb_watchdog_timer_set
2009-06-08 09:22:53 -07:00
Ralf Baechle
5636919b5c MIPS: Outline udelay and fix a few issues.
Outlining fixes the issue were on certain CPUs such as the R10000 family
the delay loop would need an extra cycle if it overlaps a cacheline
boundary.

The rewrite also fixes build errors with GCC 4.4 which was changed in
way incompatible with the kernel's inline assembly.

Relying on pure C for computation of the delay value removes the need for
explicit.  The price we pay is a slight slowdown of the computation - to
be fixed on another day.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2009-06-08 16:57:51 +01:00
Jaswinder Singh Rajput
3a553147ea MIPS: ioctl.h: Fix headers_check warnings
Make ioctl.h compatible with asm-generic/ioctl.h and userspace

fix the following 'make headers_check' warning:

  usr/include/asm-mips/ioctl.h:64: extern's make no sense in userspace

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2009-06-08 16:57:51 +01:00
Yoichi Yuasa
e25bfc9243 MIPS: Cobalt: PCI bus is always required to obtain the board ID
Signed-off-by: Yoichi Yuasa <yoichi_yuasa@tripeaks.co.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2009-06-08 16:57:50 +01:00
Yoichi Yuasa
c9d89d97f0 MIPS: Kconfig: Remove "Support for" from Cavium system type
Signed-off-by: Yoichi Yuasa <yoichi_yuasa@tripeaks.co.jp>
Acked-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2009-06-08 16:57:50 +01:00
Ralf Baechle
e082f188f7 MIPS: Sibyte: Honor CONFIG_CMDLINE
Original patch by Imre Kaloz <kaloz@openwrt.org>.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2009-06-08 16:57:50 +01:00
Linus Torvalds
6025974bab Merge master.kernel.org:/home/rmk/linux-2.6-arm
* master.kernel.org:/home/rmk/linux-2.6-arm:
  [ARM] 5543/1: arm: serial amba: add missing declaration in serial.h
  [ARM] pxa: fix pxa27x_udc default pullup GPIO
  [ARM] pxa/imote2: fix UCAM sensor board ADC model number
  mx[23]: don't put clock lookups in __initdata
  fix oops when using console=ttymxcN with N > 0
  [ARM] ARMv7 errata: only apply fixes when running on applicable CPU
  [ARM] 5534/1: kmalloc must return a cache line aligned buffer
2009-06-08 08:29:31 -07:00
Ingo Molnar
1123e3ad73 perf_counter: Clean up x86 boot messages
Standardize and tidy up all the messages we print during
perfcounter initialization.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-08 12:29:30 +02:00
Thomas Gleixner
ad68922061 perf_counter, x86: Implement generalized cache event types, add Atom support
Fill in core2_hw_cache_event_id[] with the Atom model specific events.

The events can be used in all the tools via the -e (--event) parameter,
for example "-e l1-misses" or -"-e l2-accesses" or "-e l2-write-misses".

( Note: these are straight from the Intel manuals - not tested yet.)

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-08 11:18:27 +02:00
Thomas Gleixner
0312af8416 perf_counter, x86: Implement generalized cache event types, add Core2 support
Fill in core2_hw_cache_event_id[] with the Core2 model specific events.

The events can be used in all the tools via the -e (--event) parameter,
for example "-e l1-misses" or -"-e l2-accesses" or "-e l2-write-misses".

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-08 11:18:26 +02:00
Figo.zhang
aeef50bc04 x86, microcode: Simplify vfree() use
vfree() does its own 'NULL' check, so no need for check before
calling it.

In v2, remove the stray newline.

[ Impact: cleanup ]

Signed-off-by: Figo.zhang <figo1802@gmail.com>
Cc: Dmitry Adamushko <dmitry.adamushko@gmail.com>
LKML-Reference: <1244385036.3402.11.camel@myhost>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-07 16:35:11 +02:00
Lubomir Rintel
3aa6b186f8 x86: Fix non-lazy GS handling in sys_vm86()
This fixes a stack corruption panic or null dereference oops
due to a bad GS in resume_userspace() when returning from
sys_vm86() and calling lockdep_sys_exit().

Only a problem when CONFIG_LOCKDEP and CONFIG_CC_STACKPROTECTOR
enabled.

Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Cc: H. Peter Anvin <hpa@zytor.com>
LKML-Reference: <1244384628.2323.4.camel@bimbo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-07 16:31:23 +02:00
Cyrill Gorcunov
a4046f8d29 x86, nmi: Use predefined numbers instead of hardcoded one
[ Impact: cleanup ]

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
LKML-Reference: <20090607081937.GC4547@lenovo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-07 16:22:02 +02:00
Cyrill Gorcunov
103428e57b x86, apic: Fix dummy apic read operation together with broken MP handling
Ingo Molnar reported that read_apic is buggy novadays:

[    0.000000] Using APIC driver default
[    0.000000] SMP: Allowing 1 CPUs, 0 hotplug CPUs
[    0.000000] Local APIC disabled by BIOS -- you can enable it with "lapic"
[    0.000000] APIC: disable apic facility
[    0.000000] ------------[ cut here ]------------
[    0.000000] WARNING: at arch/x86/kernel/apic/apic.c:254 native_apic_read_dummy+0x2d/0x3b()
[    0.000000] Hardware name: HP OmniBook PC

Indeed we still rely on apic->read operation for SMP compiled
kernel. And instead of disfigure the SMP code with #ifdef we
allow to call apic->read. To capture any unexpected results
we check for apic->read being called for sane reason via
WARN_ON_ONCE but(!) instead of OR we should use AND logical
operation (thanks Yinghai for spotting the root of the problem).

Along with that we could be have bad MP table and we are
to fix it that way no SMP started and no complains about
BIOS bug if apic was just disabled via command line.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <20090607124840.GD4547@lenovo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-07 16:08:05 +02:00
Jean Delvare
4a4aca641b x86: Add quirk for reboot stalls on a Dell Optiplex 360
The Dell Optiplex 360 hangs on reboot, just like the Optiplex 330, so
the same quirk is needed.

Signed-off-by: Jean Delvare <jdelvare@suse.de>
Cc: Steve Conklin <steve.conklin@canonical.com>
Cc: Leann Ogasawara <leann.ogasawara@canonical.com>
Cc: <stable@kernel.org>
LKML-Reference: <200906051202.38311.jdelvare@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-07 15:51:20 +02:00
Jaswinder Singh Rajput
5095f59bda x86: cpu_debug: Remove model information to reduce encoding-decoding
Remove model information, encoding/decoding and reduce bookkeeping.

This, besides removing a lot of code and cleaning up the code, also
enables these features on many more CPUs that were enumerated before.

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
LKML-Reference: <1244224637.8212.6.camel@ht.satnam>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-07 12:22:56 +02:00
Ingo Molnar
5f4457a4f6 Merge branch 'linus' into x86/cpu 2009-06-07 12:22:15 +02:00
Ingo Molnar
56fdd18c7b Merge branch 'linus' into core/iommu
Merge reason: This branch was on an -rc5 base so pull almost-2.6.30
              to resync with the latest upstream fixes and make sure
              the combination works fine.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-07 11:35:05 +02:00
Linus Torvalds
ccc0d38ec1 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
  x86/pci: fix mmconfig detection with 32bit near 4g
  PCI: use fixed-up device class when configuring device
2009-06-06 14:33:54 -07:00
Ingo Molnar
75b5032212 Merge branch 'linus' into perfcounters/core
Merge reason: Pick up the latest fixes before the -v8 perfcounters
	      release.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-06 20:21:28 +02:00
Ingo Molnar
8326f44da0 perf_counter: Implement generalized cache event types
Extend generic event enumeration with the PERF_TYPE_HW_CACHE
method.

This is a 3-dimensional space:

       { L1-D, L1-I, L2, ITLB, DTLB, BPU } x
       { load, store, prefetch } x
       { accesses, misses }

User-space passes in the 3 coordinates and the kernel provides
a counter. (if the hardware supports that type and if the
combination makes sense.)

Combinations that make no sense produce a -EINVAL.
Combinations that are not supported by the hardware produce -ENOTSUP.

Extend the tools to deal with this, and rewrite the event symbol
parsing code with various popular aliases for the units and
access methods above. So 'l1-cache-miss' and 'l1d-read-ops' are
both valid aliases.

( x86 is supported for now, with the Nehalem event table filled in,
  and with Core2 and Atom having placeholder tables. )

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-06 13:14:47 +02:00
Ingo Molnar
a21ca2cac5 perf_counter: Separate out attr->type from attr->config
Counter type is a frequently used value and we do a lot of
bit juggling by encoding and decoding it from attr->config.

Clean this up by creating a separate attr->type field.

Also clean up the various similarly complex user-space bits
all around counter attribute management.

The net improvement is significant, and it will be easier
to add a new major type (which is what triggered this cleanup).

(This changes the ABI, all tools are adapted.)
(PowerPC build-tested.)

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-06 11:37:22 +02:00
Mark Langsdorf
fe2245c905 x86: enable GART-IOMMU only after setting up protection methods
The current code to set up the GART as an IOMMU enables GART
translations before it removes the aperture from the kernel memory
map, sets the GART PTEs to UC, sets up the guard and scratch
pages, or does a wbinvd().  This leaves the possibility of cache
aliasing open and can cause system crashes.

Re-order the code so as to enable the GART translations only
after all safeguards are in place and the tlb has been flushed.

AMD has tested this patch on both Istanbul systems and 1st
generation Opteron systems with APG enabled and seen no adverse
effects.  Istanbul systems with HT Assist enabled sometimes
see MCE errors due to cache artifacts with the unmodified
code.

Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com>
Cc: <stable@kernel.org>
Cc: Joerg Roedel <joerg.roedel@amd.com>
Cc: akpm@linux-foundation.org
Cc: jbarnes@virtuousgeek.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-06 09:42:09 +02:00
Dave Jones
2c701b1028 [CPUFREQ] powernow-k8: check space_id of _PCT registers to be FFH
The powernow-k8 driver checks to see that the Performance Control/Status
Registers are declared as FFH (functional fixed hardware) by the BIOS.
However, this check got broken in the commit:
 0e64a0c982
 [CPUFREQ] checkpatch cleanups for powernow-k8

Fix based on an original patch from Naga Chumbalkar.

Signed-off-by: Naga Chumbalkar <nagananda.chumbalkar@hp.com>
Cc: Mark Langsdorf <mark.langsdorf@amd.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2009-06-05 13:25:25 -04:00
Peter Zijlstra
f7b6eb3fa0 x86: Set context.vdso before installing the mapping
In order to make arch_vma_name() work from inside
install_special_mapping() we need to set the context.vdso
before calling it.

( This is needed for performance counters to be able to track
  this special executable area. )

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-05 14:46:40 +02:00
Rusty Russell
2cb7878a3a lguest: fix 'unhandled trap 13' with CONFIG_CC_STACKPROTECTOR
We don't set up the canary; let's disable stack protector on boot.c so
we can get into lguest_init, then set it up.  As a side effect,
switch_to_new_gdt() sets up %fs for us properly too.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-04 11:50:06 -07:00
Russell King
754c0f9a95 Merge branch 'fix' of git://git.kernel.org/pub/scm/linux/kernel/git/ycmiao/pxa-linux-2.6 2009-06-04 17:02:58 +01:00
Russell King
947ca2e983 Merge branch 'for-rmk' of git://git.pengutronix.de/git/imx/linux-2.6 2009-06-04 12:27:18 +01:00
Magnus Damm
48c72fccbf sh: 16-bit get_unaligned() sh4a fix
This patch fixes the 16-bit case of the sh4a specific
unaligned access implementation. Without this patch
the 16-bit version of sh4a get_unaligned() results in
a 32-bit read which may read more data than intended
and/or cross page boundaries.

Unbreaks mtd NOR write handling on Migo-R.

Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-06-04 20:20:24 +09:00
Paul Mackerras
1b58c2515b perf_counter: powerpc: Use new identifier names in powerpc-specific code
Commit b23f3325 ("perf_counter: Rename various fields") fixed up
most of the uses of the renamed fields, but missed one instance
of "record_type" in powerpc-specific code which needs to be changed
to "sample_type", and a "PERF_RECORD_ADDR" in the same statement that
needs to be changed to "PERF_SAMPLE_ADDR", causing compilation
errors on powerpc.  This fixes it.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <18983.3111.770392.800486@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-04 13:20:11 +02:00
Yinghai Lu
75e613cdc7 x86/pci: fix mmconfig detection with 32bit near 4g
Pascal reported and bisected a commit:
|	x86/PCI: don't call e820_all_mapped with -1 in the mmconfig case

which broke one system system.

ACPI: Using IOAPIC for interrupt routing
PCI: MCFG configuration 0: base f0000000 segment 0 buses 0 - 255
PCI: MCFG area at f0000000 reserved in ACPI motherboard resources
PCI: Using MMCONFIG for extended config space

it didn't have
PCI: updated MCFG configuration 0: base f0000000 segment 0 buses 0 - 63
anymore, and try to use 0xf000000 - 0xffffffff for mmconfig

For 32bit, mcfg_res->end could be 32bit only (if 64 resources aren't used)
So use end - 1 to pass the value in mcfg->end to avoid overflow.

We don't need to worry about the e820 path, they are always 64 bit.

Reported-by: Pascal Terjan <pterjan@mandriva.com>
Bisected-by: Pascal Terjan <pterjan@mandriva.com>
Tested-by: Pascal Terjan <pterjan@mandriva.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: stable@kernel.org
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-06-04 11:31:13 +01:00
Philipp Zabel
1257629b07 [ARM] pxa: fix pxa27x_udc default pullup GPIO
Currently, pxa27x_udc tries to use GPIO 0 as D+ pullup if not
explicitly configured. Default to an invalid GPIO (-1) instead.

Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com>
Acked-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: Eric Miao <eric.miao@marvell.com>
2009-06-04 11:06:25 +08:00
Jonathan Cameron
d81e77f041 [ARM] pxa/imote2: fix UCAM sensor board ADC model number
Signed-off-by: Jonathan Cameron <jic23@cam.ac.uk>
Signed-off-by: Eric Miao <eric.miao@marvell.com>
2009-06-04 11:06:25 +08:00
Ingo Molnar
128f048f0f perf_counter: Fix throttling lock-up
Throttling logic is broken and we can lock up with too small
hw sampling intervals.

Make the throttling code more robust: disable counters even
if we already disabled them.

( Also clean up whitespace damage i noticed while reading
  various pieces of code related to throttling. )

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-03 23:39:51 +02:00
Cliff Wickman
0e2595cdfd x86: Fix UV BAU activation descriptor init
The UV tlb shootdown code has a serious initialization error.

An array of structures [32*8] is initialized as if it were [32].
The array is indexed by (cpu number on the blade)*8, so the short
initialization works for up to 4 cpus on a blade.
But above that, we provide an invalid opcode to the hub's
broadcast assist unit.

This patch changes the allocation of the array to use its symbolic
dimensions for better clarity. And initializes all 32*8 entries.

Shortened 'UV_ACTIVATION_DESCRIPTOR_SIZE' to 'UV_ADP_SIZE' per Ingo's
recommendation.

Tested on the UV simulator.

Signed-off-by: Cliff Wickman <cpw@sgi.com>
Cc: <stable@kernel.org>
LKML-Reference: <E1M6lZR-0007kV-Aq@eag09.americas.sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-03 13:07:31 +02:00
Rabin Vincent
6b4bfb87b6 mx[23]: don't put clock lookups in __initdata
Remove the __initdata annotation for the clock lookups, since they will
be needed when loading modules which use clk_get().

Tested-by: Agustín Ferrín Pozuelo <gatoguan-os@yahoo.com>
Signed-off-by: Rabin Vincent <rabin@rab.in>
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
2009-06-03 11:51:06 +02:00
Paul Mackerras
dcd945e0d8 perf_counter: powerpc: Fix race causing "oops trying to read PMC0" errors
When using interrupting counters and limited (non-interrupting)
counters at the same time, it's possible that we get an
interrupt in write_mmcr0() after writing MMCR0 but before we
have set up the counters using limited PMCs.  What happens then
is that we get into perf_counter_interrupt() with
counter->hw.idx = 0 for the limited counters, leading to the
"oops trying to read PMC0" error message being printed.

This fixes the problem by making perf_counter_interrupt()
robust against counter->hw.idx being zero (the counter is just
ignored in that case) and also by changing write_mmcr0() to
write MMCR0 initially with the counter overflow interrupt
enable bits masked (set to 0).  If the MMCR0 value requested by
the caller has either of those bits set, we write MMCR0 again
with the requested value of those bits after setting up the
limited counters properly.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: John Kacur <jkacur@redhat.com>
Cc: Stephane Eranian <eranian@googlemail.com>
LKML-Reference: <18982.17684.138182.954599@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-03 11:49:53 +02:00
Paul Mackerras
6984efb692 perf_counter: powerpc: Fix event alternative code generation on POWER5/5+
Commit ef923214 ("perf_counter: powerpc: use u64 for event
codes internally") introduced a bug where the return value from
function find_alternative_bdecode gets put into a u64 variable
and later tested to see if it is < 0.  The effect is that we
get extra, bogus event code alternatives on POWER5 and POWER5+,
leading to error messages such as "oops compute_mmcr failed"
being printed and counters not counting properly.

This fixes it by using s64 for the return type of
find_alternative_bdecode and for the local variable that the
caller puts the value in.  It also makes the event argument a
u64 on POWER5+ for consistency.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: John Kacur <jkacur@redhat.com>
Cc: Stephane Eranian <eranian@googlemail.com>
LKML-Reference: <18982.17586.666132.90983@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-03 11:49:52 +02:00
Jiri Slaby
367d04c4ec amd_iommu: fix lock imbalance
In alloc_coherent there is an omitted unlock on the path where mapping
fails. Add the unlock.

[ Impact: fix lock imbalance in alloc_coherent ]

Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Cc: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-06-03 10:34:55 +02:00