Commit Graph

41545 Commits (0d782dc430d94dc36b47cb11c2e33ecb1bb38234)

Author SHA1 Message Date
John Kacur 87fbaf6aea m68k: Remove the BKL from sys_execve
This seems like a copy-and-paste from code that no-longer needs the BKL
Just remove it.

Signed-off-by: John Kacur <jkacur@redhat.com>
Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2009-12-06 11:18:26 +01:00
Tim Abbott 7c5fd5619d m68k: Cleanup linker scripts using new linker script macros.
Signed-off-by: Tim Abbott <tabbott@ksplice.com>
Tested-by: Andreas Schwab <schwab@linux-m68k.org>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2009-12-06 11:18:25 +01:00
Tim Abbott 5cdef24b2a m68k: Make thread_info.h usable from assembly.
[Geert] <asm/thread_info_mm.h> pulls in <asm/current.h>, which contains C only.
So the include must be moved inside #ifndef __ASSEMBLY__.

Signed-off-by: Tim Abbott <tabbott@ksplice.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: linux-m68k@lists.linux-m68k.org
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2009-12-06 11:18:24 +01:00
Greg Ungerer f60a557267 m68knommu: define arch_has_single_step() and friends
Towards adding CONFIG_UTRACE support for non-mmu m68k add
arch_has_single_step, and its support functions user_enable_single_step()
and user_disable_single_step().
[Geert] m68k conflict resolution from linux-next

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2009-12-06 11:18:23 +01:00
Shaun Patterson 8055039c2a x86: Fix typo in arch/x86/mm/kmmio.c
Signed-off-by: Shaun Patterson <shaunpatterson@gmail.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: pq@iki.fi
LKML-Reference: <1260027694.10074.170.camel@linux-4lgc.site>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-06 09:23:40 +01:00
Frederic Weisbecker af2d8289f5 x86: Fixup wrong irq frame link in stacktraces
When we enter in irq, two things can happen to preserve the link
to the previous frame pointer:

- If we were in an irq already, we don't switch to the irq stack
  as we are inside. We just need to save the previous frame
  pointer and to link the new one to the previous.

- Otherwise we need another level of indirection. We enter the irq with
  the previous stack. We save the previous bp inside and make bp
  pointing to its saved address. Then we switch to the irq stack and
  push bp another time but to the new stack. This makes two levels to
  dereference instead of one.

In the second case, the current stacktrace code omits the second level
and loses the frame pointer accuracy. The stack that follows will then
be considered as unreliable.

Handling that makes the perf callchain happier.
Before:

43.94%  [k] _raw_read_lock
            |
            --- _read_lock
               |
               |--60.53%-- send_sigio
               |          __kill_fasync
               |          kill_fasync
               |          evdev_pass_event
               |          evdev_event
               |          input_pass_event
               |          input_handle_event
               |          input_event
               |          synaptics_process_byte
               |          psmouse_handle_byte
               |          psmouse_interrupt
               |          serio_interrupt
               |          i8042_interrupt
               |          handle_IRQ_event
               |          handle_edge_irq
               |          handle_irq
               |          __irqentry_text_start
               |          ret_from_intr
               |          |
               |          |--30.43%-- __select
               |          |
               |          |--17.39%-- 0x454f15
               |          |
               |          |--13.04%-- __read
               |          |
               |          |--13.04%-- vread_hpet
               |          |
               |          |--13.04%-- _xcb_lock_io
               |          |
               |           --13.04%-- 0x7f630878ce8

After:

    50.00%  [k] _raw_read_lock
            |
            --- _read_lock
               |
               |--98.97%-- send_sigio
               |          __kill_fasync
               |          kill_fasync
               |          evdev_pass_event
               |          evdev_event
               |          input_pass_event
               |          input_handle_event
               |          input_event
               |          |
               |          |--96.88%-- synaptics_process_byte
               |          |          psmouse_handle_byte
               |          |          psmouse_interrupt
               |          |          serio_interrupt
               |          |          i8042_interrupt
               |          |          handle_IRQ_event
               |          |          handle_edge_irq
               |          |          handle_irq
               |          |          __irqentry_text_start
               |          |          ret_from_intr
               |          |          |
               |          |          |--39.78%-- __const_udelay
               |          |          |          |
               |          |          |          |--91.89%-- ath5k_hw_register_timeout
               |          |          |          |          ath5k_hw_noise_floor_calibration
               |          |          |          |          ath5k_hw_reset
               |          |          |          |          ath5k_reset
               |          |          |          |          ath5k_config
               |          |          |          |          ieee80211_hw_config
               |          |          |          |          |
               |          |          |          |          |--88.24%-- ieee80211_scan_work
               |          |          |          |          |          worker_thread
               |          |          |          |          |          kthread
               |          |          |          |          |          child_rip
               |          |          |          |          |
               |          |          |          |           --11.76%-- ieee80211_scan_completed
               |          |          |          |                     ieee80211_scan_work
               |          |          |          |                     worker_thread
               |          |          |          |                     kthread
               |          |          |          |                     child_rip
               |          |          |          |
               |          |          |           --8.11%-- ath5k_hw_noise_floor_calibration
               |          |          |                     ath5k_hw_reset
               |          |          |                     ath5k_reset
               |          |          |                     ath5k_config

Note: This does not only affect perf events but also x86-64
stacktraces. They were considered as unreliable once we quit
the irq stack frame.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: "K. Prasad" <prasad@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
2009-12-06 08:27:24 +01:00
Frederic Weisbecker b625b3b3b7 x86: Fixup wrong debug exception frame link in stacktraces
While dumping a stacktrace, the end of the exception stack won't link
the frame pointer to the previous stack.

The interrupted stack will then be considered as unreliable and ignored
by perf, as the frame pointer is unreliable itself.

This happens because we overwrite the frame pointer that links to the
interrupted frame with the address of the exception stack. This is
done in order to reserve space inside.
But rbp has been chosen here only because it is not a scratch register,
so that the address of the exception stack remains in rbp after calling
do_debug(), we can then release the exception stack space without the
need to retrieve its address again.

But we can pick another non-scratch register to do that, so that we
preserve the link to the interrupted stack frame in the stacktraces.

Just randomly choose r12. Every registers are saved just before and
restored just after calling do_debug(). And r12 is not used in the
middle, which makes it a perfect candidate.

Example: perf record -g -a -c 1 -f -e mem:$(tasklist_lock_addr):rw

Before:
    44.18%  [k] _raw_read_lock
            |
            |
            ---  |--6.31%-- waitid
                 |
                 |--4.26%-- writev
                 |
                 |--3.63%-- __select
                 |
                 |--3.15%-- __waitpid
                 |          |
                 |          |--28.57%-- 0x8b52e00000139f
                 |          |
                 |          |--28.57%-- 0x8b52e0000013c6
                 |          |
                 |          |--14.29%-- 0x7fde786dc000
                 |          |
                 |          |--14.29%-- 0x62696c2f7273752f
                 |          |
                 |           --14.29%-- 0x1ea9df800000000
                 |
                 |--3.00%-- __poll

After:

    43.94%  [k] _raw_read_lock
            |
            --- _read_lock
               |
               |--60.53%-- send_sigio
               |          __kill_fasync
               |          kill_fasync
               |          evdev_pass_event
               |          evdev_event
               |          input_pass_event
               |          input_handle_event
               |          input_event
               |          synaptics_process_byte
               |          psmouse_handle_byte
               |          psmouse_interrupt
               |          serio_interrupt
               |          i8042_interrupt
               |          handle_IRQ_event
               |          handle_edge_irq
               |          handle_irq
               |          __irqentry_text_start
               |          ret_from_intr
               |          |
               |          |--30.43%-- __select
               |          |
               |          |--17.39%-- 0x454f15
               |          |
               |          |--13.04%-- __read
               |          |
               |          |--13.04%-- vread_hpet
               |          |
               |          |--13.04%-- _xcb_lock_io
               |          |
               |           --13.04%-- 0x7f630878ce87

Note: it does not only affect perf events but also other stacktraces in
x86-64. They were considered as unreliable once we quit the debug
stack frame.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: "K. Prasad" <prasad@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
2009-12-06 08:27:22 +01:00
Frederic Weisbecker 7f33f9c5cc x86/perf: Exclude the debug stack from the callchains
Dumping the callchains from breakpoint events with perf gives strange
results:

3.75%             perf  [kernel]           [k] _raw_read_unlock
                       |
                       --- _raw_read_unlock
                           perf_callchain
                           perf_prepare_sample
                           __perf_event_overflow
                           perf_swevent_overflow
                           perf_swevent_add
                           perf_bp_event
                           hw_breakpoint_exceptions_notify
                           notifier_call_chain
                           __atomic_notifier_call_chain
                           atomic_notifier_call_chain
                           notify_die
                           do_debug
                           debug
                           munmap

We are infected with all the debug stack. Like the nmi stack, the debug
stack is undesired as it is part of the profiling path, not helpful for
the user.

Ignore it.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: "K. Prasad" <prasad@linux.vnet.ibm.com>
2009-12-06 08:27:21 +01:00
Frederic Weisbecker b326e9560a hw-breakpoints: Use overflow handler instead of the event callback
struct perf_event::event callback was called when a breakpoint
triggers. But this is a rather opaque callback, pretty
tied-only to the breakpoint API and not really integrated into perf
as it triggers even when we don't overflow.

We prefer to use overflow_handler() as it fits into the perf events
rules, being called only when we overflow.

Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: "K. Prasad" <prasad@linux.vnet.ibm.com>
2009-12-06 08:27:18 +01:00
Frederic Weisbecker 2f0993e0fb hw-breakpoints: Drop callback and task parameters from modify helper
Drop the callback and task parameters from modify_user_hw_breakpoint().
For now we have no user that need to modify a breakpoint to the point
of changing its handler or its task context.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: "K. Prasad" <prasad@linux.vnet.ibm.com>
2009-12-06 08:27:17 +01:00
David S. Miller 7f5620a5fc sparc: Set UTS_MACHINE correctly.
"ARCH" can be just about anything, so we shouldn't end up
with UTS_MACHINE of "sparc" in a 64-bit kernel build just
because someone set the personality using 'sparc32' or
similar.  CONFIG_SPARC64 drives the compilation and
therefore provides the definitive value, not "ARCH".

This mirrors commit 8c6531f7a9
(x86: correctly set UTS_MACHINE for "make ARCH=x86")

Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-05 17:17:55 -08:00
Linus Torvalds 6ec22f9b03 Merge branch 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: Limit number of per cpu TSC sync messages
  x86: dumpstack, 64-bit: Disable preemption when walking the IRQ/exception stacks
  x86: dumpstack: Clean up the x86_stack_ids[][] initalization and other details
  x86, cpu: mv display_cacheinfo -> cpu_detect_cache_sizes
  x86: Suppress stack overrun message for init_task
  x86: Fix cpu_devs[] initialization in early_cpu_init()
  x86: Remove CPU cache size output for non-Intel too
  x86: Minimise printk spew from per-vendor init code
  x86: Remove the CPU cache size printk's
  cpumask: Avoid cpumask_t in arch/x86/kernel/apic/nmi.c
  x86: Make sure we also print a Code: line for show_regs()
2009-12-05 15:33:27 -08:00
Linus Torvalds 83be7d764d Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, msr, cpumask: Use struct cpumask rather than the deprecated cpumask_t
  x86, cpuid: Simplify the code in cpuid_open
  x86, cpuid: Remove the bkl from cpuid_open()
  x86, msr: Remove the bkl from msr_open()
  x86: AMD Geode LX optimizations
  x86, msr: Unify rdmsr_on_cpus/wrmsr_on_cpus
2009-12-05 15:32:35 -08:00
Linus Torvalds c2ed69cdc9 Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: Fix a section mismatch in arch/x86/kernel/setup.c
  x86: Fixup last users of irq_chip->typename
  x86: Remove BKL from apm_32
  x86: Remove BKL from microcode
  x86: use kernel_stack_pointer() in kprobes.c
  x86: use kernel_stack_pointer() in kgdb.c
  x86: use kernel_stack_pointer() in dumpstack.c
  x86: use kernel_stack_pointer() in process_32.c
2009-12-05 15:32:18 -08:00
Linus Torvalds ef26b1691d Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  include/linux/compiler-gcc4.h: Fix build bug - gcc-4.0.2 doesn't understand __builtin_object_size
  x86/alternatives: No need for alternatives-asm.h to re-invent stuff already in asm.h
  x86/alternatives: Check replacementlen <= instrlen at build time
  x86, 64-bit: Set data segments to null after switching to 64-bit mode
  x86: Clean up the loadsegment() macro
  x86: Optimize loadsegment()
  x86: Add missing might_fault() checks to copy_{to,from}_user()
  x86-64: __copy_from_user_inatomic() adjustments
  x86: Remove unused thread_return label from switch_to()
  x86, 64-bit: Fix bstep_iret jump
  x86: Don't use the strict copy checks when branch profiling is in use
  x86, 64-bit: Move K8 B step iret fixup to fault entry asm
  x86: Generate cmpxchg build failures
  x86: Add a Kconfig option to turn the copy_from_user warnings into errors
  x86: Turn the copy_from_user check into an (optional) compile time warning
  x86: Use __builtin_memset and __builtin_memcpy for memset/memcpy
  x86: Use __builtin_object_size() to validate the buffer size for copy_from_user()
2009-12-05 15:32:03 -08:00
Linus Torvalds a77d2e081b Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (30 commits)
  x86, apic: Enable lapic nmi watchdog on AMD Family 11h
  x86: Remove unnecessary mdelay() from cpu_disable_common()
  x86, ioapic: Document another case when level irq is seen as an edge
  x86, ioapic: Fix the EOI register detection mechanism
  x86, io-apic: Move the effort of clearing remoteIRR explicitly before migrating the irq
  x86: SGI UV: Map low MMR ranges
  x86: apic: Print out SRAT table APIC id in hex
  x86: Re-get cfg_new in case reuse/move irq_desc
  x86: apic: Remove not needed #ifdef
  x86: io-apic: IO-APIC MMIO should not fail on resource insertion
  x86: Remove asm/apicnum.h
  x86: apic: Do not use stacked physid_mask_t
  x86, apic: Get rid of apicid_to_cpu_present assign on 64-bit
  x86, ioapic: Use snrpintf while set names for IO-APIC resourses
  x86, apic: Use PAGE_SIZE instead of numbers
  x86: Remove local_irq_enable()/local_irq_disable() in fixup_irqs()
  x86: Use EOI register in io-apic on intel platforms
  x86: Force irq complete move during cpu offline
  x86: Remove move_cleanup_count from irq_cfg
  x86, intr-remap: Avoid irq_chip mask/unmask in fixup_irqs() for intr-remapping
  ...
2009-12-05 15:31:25 -08:00
Linus Torvalds c3fa27d136 Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (470 commits)
  x86: Fix comments of register/stack access functions
  perf tools: Replace %m with %a in sscanf
  hw-breakpoints: Keep track of user disabled breakpoints
  tracing/syscalls: Make syscall events print callbacks static
  tracing: Add DEFINE_EVENT(), DEFINE_SINGLE_EVENT() support to docbook
  perf: Don't free perf_mmap_data until work has been done
  perf_event: Fix compile error
  perf tools: Fix _GNU_SOURCE macro related strndup() build error
  trace_syscalls: Remove unused syscall_name_to_nr()
  trace_syscalls: Simplify syscall profile
  trace_syscalls: Remove duplicate init_enter_##sname()
  trace_syscalls: Add syscall_nr field to struct syscall_metadata
  trace_syscalls: Remove enter_id exit_id
  trace_syscalls: Set event_enter_##sname->data to its metadata
  trace_syscalls: Remove unused event_syscall_enter and event_syscall_exit
  perf_event: Initialize data.period in perf_swevent_hrtimer()
  perf probe: Simplify event naming
  perf probe: Add --list option for listing current probe events
  perf probe: Add argv_split() from lib/argv_split.c
  perf probe: Move probe event utility functions to probe-event.c
  ...
2009-12-05 15:30:21 -08:00
David S. Miller 28b4d5cc17 Merge branch 'master' of /home/davem/src/GIT/linux-2.6/
Conflicts:
	drivers/net/pcmcia/fmvj18x_cs.c
	drivers/net/pcmcia/nmclan_cs.c
	drivers/net/pcmcia/xirc2ps_cs.c
	drivers/net/wireless/ray_cs.c
2009-12-05 15:22:26 -08:00
Linus Torvalds 96fa2b508d Merge branch 'tracing-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'tracing-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (40 commits)
  tracing: Separate raw syscall from syscall tracer
  ring-buffer-benchmark: Add parameters to set produce/consumer priorities
  tracing, function tracer: Clean up strstrip() usage
  ring-buffer benchmark: Run producer/consumer threads at nice +19
  tracing: Remove the stale include/trace/power.h
  tracing: Only print objcopy version warning once from recordmcount
  tracing: Prevent build warning: 'ftrace_graph_buf' defined but not used
  ring-buffer: Move access to commit_page up into function used
  tracing: do not disable interrupts for trace_clock_local
  ring-buffer: Add multiple iterations between benchmark timestamps
  kprobes: Sanitize struct kretprobe_instance allocations
  tracing: Fix to use __always_unused attribute
  compiler: Introduce __always_unused
  tracing: Exit with error if a weak function is used in recordmcount.pl
  tracing: Move conditional into update_funcs() in recordmcount.pl
  tracing: Add regex for weak functions in recordmcount.pl
  tracing: Move mcount section search to front of loop in recordmcount.pl
  tracing: Fix objcopy revision check in recordmcount.pl
  tracing: Check absolute path of input file in recordmcount.pl
  tracing: Correct the check for number of arguments in recordmcount.pl
  ...
2009-12-05 09:53:36 -08:00
Linus Torvalds 3e72b810e3 Merge branch 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  mutex: Fix missing conditions to build mutex_spin_on_owner()
  mutex: Better control mutex adaptive spinning config
  locking, task_struct: Reduce size on TRACE_IRQFLAGS and 64bit
  locking: Use __[SPIN|RW]_LOCK_UNLOCKED in [spin|rw]_lock_init()
  locking: Remove unused prototype
  locking: Reduce ifdefs in kernel/spinlock.c
  locking: Make inlining decision Kconfig based
2009-12-05 09:49:59 -08:00
Linus Torvalds 7b626acb8f Merge branch 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (63 commits)
  x86, Calgary IOMMU quirk: Find nearest matching Calgary while walking up the PCI tree
  x86/amd-iommu: Remove amd_iommu_pd_table
  x86/amd-iommu: Move reset_iommu_command_buffer out of locked code
  x86/amd-iommu: Cleanup DTE flushing code
  x86/amd-iommu: Introduce iommu_flush_device() function
  x86/amd-iommu: Cleanup attach/detach_device code
  x86/amd-iommu: Keep devices per domain in a list
  x86/amd-iommu: Add device bind reference counting
  x86/amd-iommu: Use dev->arch->iommu to store iommu related information
  x86/amd-iommu: Remove support for domain sharing
  x86/amd-iommu: Rearrange dma_ops related functions
  x86/amd-iommu: Move some pte allocation functions in the right section
  x86/amd-iommu: Remove iommu parameter from dma_ops_domain_alloc
  x86/amd-iommu: Use get_device_id and check_device where appropriate
  x86/amd-iommu: Move find_protection_domain to helper functions
  x86/amd-iommu: Simplify get_device_resources()
  x86/amd-iommu: Let domain_for_device handle aliases
  x86/amd-iommu: Remove iommu specific handling from dma_ops path
  x86/amd-iommu: Remove iommu parameter from __(un)map_single
  x86/amd-iommu: Make alloc_new_range aware of multiple IOMMUs
  ...
2009-12-05 09:49:07 -08:00
David Daney 27d16d0871 avr32: Convert BUG() to use unreachable()
Use the new unreachable() macro instead of for(;;);

Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Acked-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-05 09:10:12 -08:00
David Daney 5506e68975 s390: Convert BUG() to use unreachable()
Use the new unreachable() macro instead of for(;;);

Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
CC: Heiko Carstens <heiko.carstens@de.ibm.com>
CC: linux390@de.ibm.com
CC: linux-s390@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-05 09:10:12 -08:00
David Daney 4ef5651e85 MIPS: Convert BUG() to use unreachable()
Use the new unreachable() macro instead of while(1);

Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
CC: linux-mips@linux-mips.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-05 09:10:12 -08:00
David Daney a5fc5eba4d x86: Convert BUG() to use unreachable()
Use the new unreachable() macro instead of for(;;);.  When
allyesconfig is built with a GCC-4.5 snapshot on i686 the size of the
text segment is reduced by 3987 bytes (from 6827019 to 6823032).

Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Acked-by: "H. Peter Anvin" <hpa@zytor.com>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Ingo Molnar <mingo@redhat.com>
CC: x86@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-05 09:10:12 -08:00
Russell King 0719dc3413 Merge branch 'devel-stable' into devel 2009-12-05 10:35:33 +00:00
Russell King e28edb723e Merge branches 'at91', 'ep93xx', 'etm', 'ks8695', 'nuc', 'u300' and 'u8500' into devel 2009-12-05 10:35:18 +00:00
Leann Ogasawara 4832ddda2e x86: ASUS P4S800 reboot=bios quirk
Bug reporter noted their system with an ASUS P4S800 motherboard would
hang when rebooting unless reboot=b was specified.  Their dmidecode
didn't contain descriptive System Information for Manufacturer or
Product Name, so I used their Base Board Information to create a
reboot quirk patch.  The bug reporter confirmed this patch resolves
the reboot hang.

Handle 0x0001, DMI type 1, 25 bytes
System Information
       Manufacturer: System Manufacturer
       Product Name: System Name
       Version: System Version
       Serial Number: SYS-1234567890
       UUID: E0BFCD8B-7948-D911-A953-E486B4EEB67F
       Wake-up Type: Power Switch

Handle 0x0002, DMI type 2, 8 bytes
Base Board Information
     Manufacturer: ASUSTeK Computer INC.
     Product Name: P4S800
     Version: REV 1.xx
     Serial Number: xxxxxxxxxxx

BugLink: http://bugs.launchpad.net/bugs/366682

ASUS P4S800 will hang when rebooting unless reboot=b is specified.
Add a quirk to reboot through the bios.

Signed-off-by: Leann Ogasawara <leann.ogasawara@canonical.com>
LKML-Reference: <1259972107.4629.275.camel@emiko>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: <stable@kernel.org>
2009-12-04 16:38:59 -08:00
Chris Wright 5d990b6275 PCI: add pci_request_acs
Commit ae21ee65e8 "PCI: acs p2p upsteram
forwarding enabling" doesn't actually enable ACS.

Add a function to pci core to allow an IOMMU to request that ACS
be enabled.  The existing mechanism of using iommu_found() in the pci
core to know when ACS should be enabled doesn't actually work due to
initialization order;  iommu has only been detected not initialized.

Have Intel and AMD IOMMUs request ACS, and Xen does as well during early
init of dom0.

Cc: Allen Kay <allen.m.kay@intel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-12-04 16:19:24 -08:00
Yinghai Lu 575939cf54 x86/PCI: claim SR-IOV BARs in pcibios_allocate_resource
This allows us to use the BIOS SR-IOV allocations rather than assigning
our own later on.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-12-04 16:00:17 -08:00
Adam Buchbinder 6070d81eb5 tree-wide: fix misspelling of "definition" in comments
"Definition" is misspelled "defintion" in several comments; this
patch fixes them. No code changes.

Signed-off-by: Adam Buchbinder <adam.buchbinder@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2009-12-04 23:41:47 +01:00
Andreas Schwab f195e2bff3 m68k: ptrace fixes
This fixes the following issues in ptrace:

- when single stepping into the signal handler stop at the first insn of
  the handler
- handle non-zero stkadj when accessing pc and sr in ptregs
- correctly handle PT_SR in PTRACE_POKEUSR
- report -EIO when trying to read unknown offset in PTRACE_PEEKUSR

Additionally, the handling of the special case that PT_SR accesses a 16
bit word instead of a 32 bit word has been moved into get_reg/put_reg.

Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2009-12-04 21:22:35 +01:00
Andreas Schwab faa47b4669 m68k: use generic code for ptrace requests
Remove all but PTRACE_{PEEK,POKE}USR and PTRACE_{GET,SET}{REGS,FPREGS}
from arch_ptrace and let the rest be handled by generic code.  Define
PTRACE_SINGLEBLOCK to enable singleblock tracing.
[Geert] Not yet applicable for m68knommu

Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2009-12-04 21:22:35 +01:00
Russell King 677f4f64e4 Merge branch 'for-rmk' of git://git.pengutronix.de/git/imx/linux-2.6 into devel-stable 2009-12-04 17:34:50 +00:00
Russell King 4567c4a896 Merge branch 'devel' of git://git.kernel.org/pub/scm/linux/kernel/git/ycmiao/pxa-linux-2.6 into devel-stable 2009-12-04 17:34:16 +00:00
Russell King 602fd7c367 Merge branch 'for-rmk' of git://git.fluff.org/bjdooks/linux into devel-stable
Conflicts:
	arch/arm/Kconfig
2009-12-04 17:33:54 +00:00
Takashi Iwai baf9226667 Merge branch 'topic/asoc' into for-linus 2009-12-04 16:22:41 +01:00
Russell King 2fc42814d8 Merge branch 'pending-dma-streaming' (early part) into devel 2009-12-04 15:00:11 +00:00
Russell King c6baa1963c Merge branch 'pending-dma-coherent' into devel 2009-12-04 15:00:00 +00:00
Russell King 5cb2faa6ed Merge branch 'pending-misc' (early part) into devel 2009-12-04 14:59:47 +00:00
Russell King 6060e8df51 ARM: I-cache: flush executable mappings in flush_cache_range()
Dirk Behme reported instability on ARM11 SMP (VIPT non-aliasing cache)
caused by the dynamic linker changing protection on text pages to write
GOT entries.  The problem is due to an interaction between the write
faulting code providing new anonymous pages which are incoherent with
the I-cache due to write buffering, and the I-cache not having been
invalidated.

a4db94d plugs the hole with the data cache coherency.  This patch
provides the other half of the fix by flushing the I-cache in
flush_cache_range() for VM_EXEC VMAs (which is what we have when the
region is being made executable again.)  This ensures that the I-cache
will be up to date with the newly COW'd pages.

Note: if users are writing instructions, then they still need to use
the ARM sys_cacheflush API to ensure that the caches are correctly
synchronized.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-12-04 14:58:51 +00:00
Russell King ea201dbb78 ARM: I-cache: avoid flushing in flush_cache_mm()
flush_cache_mm() is called in two cases:
1. when a process exits, just before the page tables are torn down.
   We can allow the stale lines to evict themselves over time without
   causing any harm.

2. when a process forks, and we've allocated a new ASID.
   The instruction cache issues are dealt with as pages are brought
   into the new process address space.  Flushing the I-cache here is
   therefore unnecessary.

However, we must keep the VIPT aliasing D-cache flush to ensure that
any dirty cache lines are not written back after the pages have been
reallocated for some other use - which would result in corruption.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-12-04 14:58:51 +00:00
Russell King 9e95922b10 ARM: I-cache: Add invalidation for VIVT ASID tagged caches
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-12-04 14:58:51 +00:00
Catalin Marinas 115b22474e ARM: 5794/1: Flush the D-cache during copy_user_highpage()
The I and D caches for copy-on-write pages on processors with
write-allocate caches become incoherent causing problems on application
relying on CoW for text pages (dynamic linker relocating symbols in a
text page). This patch flushes the D-cache for such pages.

Cc: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-12-04 14:58:50 +00:00
Russell King f91fb05d82 ARM: Remove __flush_icache_all() from __flush_dcache_page()
Both call sites for __flush_dcache_page() end up calling
__flush_icache_all() themselves, so having __flush_dcache_page() do
this as well is wasteful.  Remove the duplicated icache flushing.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-12-04 14:58:50 +00:00
Russell King 2df341edf6 ARM: Move __flush_icache_all() out of flush_pfn_alias()
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-12-04 14:58:50 +00:00
Russell King 7b0a1003e7 ARM: Reduce __flush_dcache_page() visibility
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-12-04 14:58:50 +00:00
André Goddard Rosa af901ca181 tree-wide: fix assorted typos all over the place
That is "success", "unknown", "through", "performance", "[re|un]mapping"
, "access", "default", "reasonable", "[con]currently", "temperature"
, "channel", "[un]used", "application", "example","hierarchy", "therefore"
, "[over|under]flow", "contiguous", "threshold", "enough" and others.

Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2009-12-04 15:39:55 +01:00
Robert Schwebel fedea672a3 mx31moboard: fix typo
Currently, linux-next breaks due to a typo introduced in commit
33c4d91928

This patch fixes it.

Signed-off-by: Robert Schwebel <r.schwebel@pengutronix.de>
Cc: Valentin Longchamp <valentin.longchamp@epfl.ch>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
2009-12-04 13:23:41 +01:00
Mark Brown 43f0de8d02 S3C64XX: Staticise platform data for PCM devices
The symbols aren't declared and don't need to be exported, they go
along with the device structure.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Acked-by: Ben Dooks <ben-linux@fluff.org>
2009-12-04 10:46:08 +00:00
Magnus Damm 6a5a0b9139 sh: include empty zero page in romImage
This patch updates the romImage code to include the
empty_zero_page contents from vmlinux. Without this
patch the empty zero page is lef uninitialized.

Signed-off-by: Magnus Damm <damm@opensource.se>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-12-04 16:26:20 +09:00
Paul Mundt 6e8a0d11a0 sh: Make associative cache writes fatal on all SH-4A parts.
Now that associative cache writes are no longer needed by the SH-4/SH-4A
cache flush code, associative write support can be explicitly disabled
for all SH-4A parts. This makes any associative write throw an exception,
as this behaviour can not be assumed to exist on future parts.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-12-04 16:22:11 +09:00
Matt Fleming a781d1e5ff sh: Drop associative writes for SH-4 cache flushes.
When flushing/invalidating the icache/dcache via the memory-mapped IC/OC
address arrays, the associative bit should only be used in conjunction with
virtual addresses. However, we currently flush cache lines based on physical
address, so stop using the associative bit.

It is a better strategy to use non-associative writes (and physical tags) for
flushing the caches anyway, because flushing by virtual address (as with the
A-bit set) requires a valid TLB entry for that virtual address. If one does not
exist in the TLB no exception is generated and the flush is silently ignored.

This is also future-proofing for SH-4A parts which are gradually phasing out
associative writes to the cache array due to the aforementioned case of certain
flushes silently turning in to nops.

Signed-off-by: Matt Fleming <matt@console-pimps.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-12-04 16:18:11 +09:00
Paul Mundt 7e01c94998 sh: Partial revert of copy/clear_user_highpage() optimizations.
These still require more testing, so revert them for now. We keep the
off-by-1 in the fixmap colouring and drop the rest.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-12-04 15:14:52 +09:00
Paul Mundt 8144a7dd51 sh: Add default uImage rule for se7724, ap325rxa, and migor.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-12-04 13:57:40 +09:00
Magnus Damm a65d0d79c4 sh: allow runtime pm without suspend/resume callbacks
This patch updates the Runtime PM code for SuperH Mobile
to allow drivers to have NULL as pm or callback value.
With this in place there is no need for no-op functions.

Signed-off-by: Magnus Damm <damm@opensource.se>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-12-04 13:42:37 +09:00
Kuninori Morimoto 1c2e36cc9b sh: mach-ecovec24: Remove un-defined settings for VPU
The setting of VPU need not be changed from default.
And current setting value is not defined on SH7724

Reported-by:   Goda Yusuke <goda.yusuke@renesas.com>
Signed-off-by: Kuninori Morimoto <morimoto.kuninori@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-12-04 13:42:34 +09:00
Kuninori Morimoto 82b3322178 sh: mach-ecovec24: LCDC drive ability become high
Drive ability for LCDC become high for safety,
became there is strange individual specificity board in mass production

Signed-off-by: Kuninori Morimoto <morimoto.kuninori@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-12-04 13:42:32 +09:00
Magnus Damm 7e213481d6 sh: fix sh7724 VEU3F resource size
Fix one-off VEU3F size error for sh7724.

Signed-off-by: Magnus Damm <damm@opensource.se>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-12-04 13:42:29 +09:00
Steven King 96c612427e m68knommu: export clk_* symbols in clk.c
export the clk_*  stubs defined in arch/m68knommu/platform/coldfire/clk.c so
they can be used by modules.

Signed-off-by: Steven King <sfking@fdwdc.com>
Signed-off-by: Greg Ungerer <gerg@goober.(none)>
2009-12-04 11:45:32 +10:00
Tim Abbott 53749f735a m68knommu: Split the .init section into INIT_TEXT_SECTION and INIT_DATA_SECTION.
Signed-off-by: Tim Abbott <tabbott@ksplice.com>
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
2009-12-04 11:45:32 +10:00
Tim Abbott 995bcd3dc1 m68knommu: Move __init_end out of the .init section.
Signed-off-by: Tim Abbott <tabbott@ksplice.com>
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
2009-12-04 11:45:32 +10:00
Tim Abbott a90a44ee90 m68knommu: Move __init_begin out of the .init section.
Signed-off-by: Tim Abbott <tabbott@ksplice.com>
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
2009-12-04 11:45:32 +10:00
Tim Abbott 84bd757155 m68knommu: Use more macros inside the .init section.
Signed-off-by: Tim Abbott <tabbott@ksplice.com>
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
2009-12-04 11:45:31 +10:00
Tim Abbott 49612a5fa5 m68knommu: Use INIT_TASK_DATA and CACHELINE_ALIGNED_DATA.
Signed-off-by: Tim Abbott <tabbott@ksplice.com>
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
2009-12-04 11:45:31 +10:00
Tim Abbott d6cd1f0c38 m68knommu: Make THREAD_SIZE available to assembly files.
Signed-off-by: Tim Abbott <tabbott@ksplice.com>
Cc: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
2009-12-04 11:45:31 +10:00
Tim Abbott f4bed4fb17 m68knommu: Don't hardcode the value of PAGE_SIZE in the linker script.
Signed-off-by: Tim Abbott <tabbott@ksplice.com>
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
2009-12-04 11:45:31 +10:00
Greg Ungerer 10f204e5ad m68knommu: rename BSS define in linker script
The "BSS" define name now used in asm-generic/vmlinux.lds.h
(introduced in commit ef53dae865)
clashes with the internal "BSS" define in the m68knommu vmlinux.lds.S
linker script. So rename it.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
2009-12-04 11:45:31 +10:00
Greg Ungerer c23b6538d0 m68knommu: add a task_pt_regs() macro
Add a task_pt_regs() macro as per the CONFIG_UTRACE requirements.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
2009-12-04 11:45:30 +10:00
Greg Ungerer 193f087d49 m68knommu: define arch_has_single_step() and friends
Towards adding CONFIG_UTRACE support for non-mmu m68k add
arch_has_single_step, and its support functions user_enable_single_step()
and user_disable_single_step().

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
2009-12-04 11:45:30 +10:00
Lennart Sorensen 588baeac38 m68knommu: add uboot commandline argument passing support
This patch adds m68knommu support for getting the kernel command line
arguments from uboot, including the passing of an initrd image from uboot.

We use this on a 5270/5271 based board, and have used it on the 5271evb
development board.  It is based on a patch found in the linux-2.6-denx
git tree, although that tree seems to have had lots of other changes
since which are not in the main Linus kernel.  I believe this will work
on all coldfires, although other m68knommu might be missing the _init_sp
stuff in head.S as far as I can tell.  I only have the coldfire to
test on.

Signed-off-by: Lennart Sorensen <lsorense@csclub.uwaterloo.ca>
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
2009-12-04 11:45:30 +10:00
Steven King b0d22d66fd m68knommu: Coldfire GPIO corrections
Pin 0 of the EPORT is not connected on the 523x, 5271, 5275 and 528x and the
TIMER on the 523x has 8 pins, not 4.

Signed-off-by: Steven King <sfking@fdwdc.com>
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
2009-12-04 11:45:30 +10:00
Mark Brown 88d27041cf ARM: S3C6410: Correct names of IISv4 data output pin definitions
The naming of the defines suggests that there are three IISv4 ports
with one data line each when in fact there is a single IISv4 port
with three data lines.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
2009-12-03 21:58:10 +00:00
Ben Dooks 009f742bde ARM: Merge next-s3c64xx-updates
Merge branch 'next-s3c64xx-updates' into for-rmk

Conflicts:

	arch/arm/plat-s3c/dev-hsmmc2.c
	arch/arm/plat-s3c/include/plat/sdhci.h
2009-12-03 21:53:10 +00:00
Ben Dooks f18ea8276b ARM: Merge next-s3c24xx-dev-rtp
Merge branch 'next-s3c24xx-dev-rtp' into for-rmk

Conflicts:

	arch/arm/mach-s3c2440/mach-anubis.c
2009-12-03 21:33:01 +00:00
Ben Dooks 3d4db84cee ARM: Merge next-s3c24xx-simtec
Merge branch 'next-s3c24xx-simtec' into for-rmk
2009-12-03 21:31:20 +00:00
Srinidhi Kasagar 48371cd3f4 ARM: 5845/1: l2x0: check whether l2x0 already enabled
If running in non-secure mode accessing
some registers of l2x0 will fault. So
check if l2x0 is already enabled, if so
do not access those secure registers.

Signed-off-by: srinidhi kasagar <srinidhi.kasagar@stericsson.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-12-03 19:42:30 +00:00
Leo Chen 1f739d7643 ARM: 5792/1: bcmring: clean up mach/io.h
removed old macro definition for io access, using
the generic macros defined in asm/io.h

Signed-off-by: Leo Hao Chen <leochen@broadcom.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-12-03 19:42:30 +00:00
Ian Campbell f6eafe3665 xen: call clock resume notifier on all CPUs
tick_resume() is never called on secondary processors. Presumably this
is because they are offlined for suspend on native and so this is
normally taken care of in the CPU onlining path. Under Xen we keep all
CPUs online over a suspend.

This patch papers over the issue for me but I will investigate a more
generic, less hacky, way of doing to the same.

tick_suspend is also only called on the boot CPU which I presume should
be fixed too.

Signed-off-by: Ian Campbell <Ian.Campbell@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stable Kernel <stable@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
2009-12-03 11:14:55 -08:00
Jeremy Fitzhardinge 6aaf5d633b xen: use iret for return from 64b kernel to 32b usermode
If Xen wants to return to a 32b usermode with sysret it must use the
right form.  When using VCGF_in_syscall to trigger this, it looks at
the code segment and does a 32b sysret if it is FLAT_USER_CS32.
However, this is different from __USER32_CS, so it fails to return
properly if we use the normal Linux segment.

So avoid the whole mess by dropping VCGF_in_syscall and simply use
plain iret to return to usermode.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Acked-by: Jan Beulich <jbeulich@novell.com>
Cc: Stable Kernel <stable@kernel.org>
2009-12-03 11:14:54 -08:00
Jeremy Fitzhardinge 499d19b82b xen: register runstate info for boot CPU early
printk timestamping uses sched_clock, which in turn relies on runstate
info under Xen.  So make sure we set it up before any printks can
be called.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stable Kernel <stable@kernel.org>
2009-12-03 11:14:53 -08:00
Ian Campbell 028896721a xen: register runstate on secondary CPUs
The commit "xen: re-register runstate area earlier on resume" caused us
to never try and setup the runstate area for secondary CPUs. Ensure that
we do this...

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stable Kernel <stable@kernel.org>
2009-12-03 11:14:52 -08:00
Ian Campbell f350c7922f xen: register timer interrupt with IRQF_TIMER
Otherwise the timer is disabled by dpm_suspend_noirq() which in turn prevents
correct operation of stop_machine on multi-processor systems and breaks
suspend.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stable Kernel <stable@kernel.org>
2009-12-03 11:14:52 -08:00
Ian Campbell fa24ba62ea xen: correctly restore pfn_to_mfn_list_list after resume
pvops kernels >= 2.6.30 can currently only be saved and restored once. The
second attempt to save results in:

    ERROR Internal error: Frame# in pfn-to-mfn frame list is not in pseudophys
    ERROR Internal error: entry 0: p2m_frame_list[0] is 0xf2c2c2c2, max 0x120000
    ERROR Internal error: Failed to map/save the p2m frame list

I finally narrowed it down to:

    commit cdaead6b4e
        Author: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
        Date:   Fri Feb 27 15:34:59 2009 -0800

            xen: split construction of p2m mfn tables from registration

            Build the p2m_mfn_list_list early with the rest of the p2m table, but
            register it later when the real shared_info structure is in place.

            Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

The unforeseen side-effect of this change was to cause the mfn list list to not
be rebuilt on resume. Prior to this change it would have been rebuilt via
xen_post_suspend() -> xen_setup_shared_info() -> xen_setup_mfn_list_list().

Fix by explicitly calling xen_build_mfn_list_list() from xen_post_suspend().

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stable Kernel <stable@kernel.org>
2009-12-03 11:14:51 -08:00
Jeremy Fitzhardinge 3905bb2aa7 xen: restore runstate_info even if !have_vcpu_info_placement
Even if have_vcpu_info_placement is not set, we still need to set up
the runstate area on each resumed vcpu.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stable Kernel <stable@kernel.org>
2009-12-03 11:14:51 -08:00
Ian Campbell be012920ec xen: re-register runstate area earlier on resume.
This is necessary to ensure the runstate area is available to
xen_sched_clock before any calls to printk which will require it in
order to provide a timestamp.

I chose to pull the xen_setup_runstate_info out of xen_time_init into
the caller in order to maintain parity with calling
xen_setup_runstate_info separately from calling xen_time_resume.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stable Kernel <stable@kernel.org>
2009-12-03 11:14:50 -08:00
Ingo Molnar d103d01e4b Merge branch 'perf/probes' into perf/core
Merge reason: add these fixes to 'perf probe'.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-03 20:11:38 +01:00
Ingo Molnar 26fb20d008 Merge branch 'perf/mce' into perf/core
Merge reason: It's ready for v2.6.33.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-03 20:11:06 +01:00
Mikael Pettersson 7d1849aff6 x86, apic: Enable lapic nmi watchdog on AMD Family 11h
The x86 lapic nmi watchdog does not recognize AMD Family 11h,
resulting in:

  NMI watchdog: CPU not supported

As far as I can see from available documentation (the BKDM),
family 11h looks identical to family 10h as far as the PMU
is concerned.

Extending the check to accept family 11h results in:

  Testing NMI watchdog ... OK.

I've been running with this change on a Turion X2 Ultra ZM-82
laptop for a couple of weeks now without problems.

Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>
Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: Joerg Roedel <joerg.roedel@amd.com>
Cc: <stable@kernel.org>
LKML-Reference: <19223.53436.931768.278021@pilspetsen.it.uu.se>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-03 16:25:15 +01:00
Jens Axboe 220d0b1dbf Merge branch 'master' into for-2.6.33 2009-12-03 13:49:39 +01:00
Xiaotian Feng 57fea8f7ab x86/reboot: Add pci_dev_put in reboot_fixup_32.c for consistency
pci_get_device will increase the ref count of found device.
Although we're going to reset soon, we should use pci_dev_put
to decrease the ref count for consistency.

Signed-off-by: Xiaotian Feng <dfeng@redhat.com>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <1259838400-23833-1-git-send-email-dfeng@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-03 12:17:55 +01:00
Darrick J. Wong 4528752f49 x86, Calgary IOMMU quirk: Find nearest matching Calgary while walking up the PCI tree
On a multi-node x3950M2 system, there's a slight oddity in the
PCI device tree for all secondary nodes:

 30:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev e1)
  \-33:00.0 PCI bridge: IBM CalIOC2 PCI-E Root Port (rev 01)
     \-34:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 1078 (rev 04)

...as compared to the primary node:

 00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev e1)
  \-01:00.0 VGA compatible controller: ATI Technologies Inc ES1000 (rev 02)
 03:00.0 PCI bridge: IBM CalIOC2 PCI-E Root Port (rev 01)
  \-04:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 1078 (rev 04)

In both nodes, the LSI RAID controller hangs off a CalIOC2
device, but on the secondary nodes, the BIOS hides the VGA
device and substitutes the device tree ending with the disk
controller.

It would seem that Calgary devices don't necessarily appear at
the top of the PCI tree, which means that the current code to
find the Calgary IOMMU that goes with a particular device is
buggy.

Rather than walk all the way to the top of the PCI
device tree and try to match bus number with Calgary descriptor,
the code needs to examine each parent of the particular device;
if it encounters a Calgary with a matching bus number, simply
use that.

Otherwise, we BUG() when the bus number of the Calgary doesn't
match the bus number of whatever's at the top of the device tree.

Extra note: This patch appears to work correctly for the x3950
that came before the x3950 M2.

Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Acked-by: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Joerg Roedel <joerg.roedel@amd.com>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: Jon D. Mason <jdmason@kudzu.us>
Cc: Corinna Schultz <coschult@us.ibm.com>
Cc: <stable@kernel.org>
LKML-Reference: <20091202230556.GG10295@tux1.beaverton.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-03 11:44:05 +01:00
Dmitry Torokhov 467832032c Merge commit 'v2.6.32' into next 2009-12-02 23:38:13 -08:00
Avi Kivity d5696725b2 KVM: VMX: Fix comparison of guest efer with stale host value
update_transition_efer() masks out some efer bits when deciding whether
to switch the msr during guest entry; for example, NX is emulated using the
mmu so we don't need to disable it, and LMA/LME are handled by the hardware.

However, with shared msrs, the comparison is made against a stale value;
at the time of the guest switch we may be running with another guest's efer.

Fix by deferring the mask/compare to the actual point of guest entry.

Noted by Marcelo.

Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03 09:34:20 +02:00
Carsten Otte f50146bd7b KVM: s390: Fix prefix register checking in arch/s390/kvm/sigp.c
This patch corrects the checking of the new address for the prefix register.
On s390, the prefix register is used to address the cpu's lowcore (address
0...8k). This check is supposed to verify that the memory is readable and
present.
copy_from_guest is a helper function, that can be used to read from guest
memory. It applies prefixing, adds the start address of the guest memory in
user, and then calls copy_from_user. Previous code was obviously broken for
two reasons:
- prefixing should not be applied here. The current prefix register is
  going to be updated soon, and the address we're looking for will be
  0..8k after we've updated the register
- we're adding the guest origin (gmsor) twice: once in subject code
  and once in copy_from_guest

With kuli, we did not hit this problem because (a) we were lucky with
previous prefix register content, and (b) our guest memory was mmaped
very low into user address space.

Cc: stable@kernel.org
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Reported-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03 09:32:26 +02:00
Avi Kivity 3548bab501 KVM: Drop user return notifier when disabling virtualization on a cpu
This way, we don't leave a dangling notifier on cpu hotunplug or module
unload.  In particular, module unload leaves the notifier pointing into
freed memory.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-12-03 09:32:26 +02:00
Sheng Yang 046d87103a KVM: VMX: Disable unrestricted guest when EPT disabled
Otherwise would cause VMEntry failure when using ept=0 on unrestricted guest
supported processors.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-12-03 09:32:25 +02:00
Avi Kivity eb3c79e64a KVM: x86 emulator: limit instructions to 15 bytes
While we are never normally passed an instruction that exceeds 15 bytes,
smp games can cause us to attempt to interpret one, which will cause
large latencies in non-preempt hosts.

Cc: stable@kernel.org
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03 09:32:25 +02:00
Carsten Otte d7b0b5eb30 KVM: s390: Make psw available on all exits, not just a subset
This patch moves s390 processor status word into the base kvm_run
struct and keeps it up-to date on all userspace exits.

The userspace ABI is broken by this, however there are no applications
in the wild using this.  A capability check is provided so users can
verify the updated API exists.

Cc: stable@kernel.org
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03 09:32:25 +02:00
Jan Kiszka 3cfc3092f4 KVM: x86: Add KVM_GET/SET_VCPU_EVENTS
This new IOCTL exports all yet user-invisible states related to
exceptions, interrupts, and NMIs. Together with appropriate user space
changes, this fixes sporadic problems of vmsave/restore, live migration
and system reset.

[avi: future-proof abi by adding a flags field]

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03 09:32:25 +02:00
Avi Kivity 65ac726404 KVM: VMX: Report unexpected simultaneous exceptions as internal errors
These happen when we trap an exception when another exception is being
delivered; we only expect these with MCEs and page faults.  If something
unexpected happens, things probably went south and we're better off reporting
an internal error and freezing.

Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03 09:32:24 +02:00
Avi Kivity a9c7399d6c KVM: Allow internal errors reported to userspace to carry extra data
Usually userspace will freeze the guest so we can inspect it, but some
internal state is not available.  Add extra data to internal error
reporting so we can expose it to the debugger.  Extra data is specific
to the suberror.

Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03 09:32:24 +02:00
Jan Kiszka 4f926bf291 KVM: x86: Polish exception injection via KVM_SET_GUEST_DEBUG
Decouple KVM_GUESTDBG_INJECT_DB and KVM_GUESTDBG_INJECT_BP from
KVM_GUESTDBG_ENABLE, their are actually orthogonal. At this chance,
avoid triggering the WARN_ON in kvm_queue_exception if there is already
an exception pending and reject such invalid requests.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-12-03 09:32:24 +02:00
Marcelo Tosatti 2204ae3c96 KVM: x86: disallow KVM_{SET,GET}_LAPIC without allocated in-kernel lapic
Otherwise kvm might attempt to dereference a NULL pointer.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03 09:32:23 +02:00
Marcelo Tosatti 3ddea128ad KVM: x86: disallow multiple KVM_CREATE_IRQCHIP
Otherwise kvm will leak memory on multiple KVM_CREATE_IRQCHIP.
Also serialize multiple accesses with kvm->lock.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03 09:32:23 +02:00
Avi Kivity 92c0d90015 KVM: VMX: Remove vmx->msr_offset_efer
This variable is used to communicate between a caller and a callee; switch
to a function argument instead.

Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03 09:32:23 +02:00
Marcelo Tosatti 5f5c35aad5 KVM: MMU: update invlpg handler comment
Large page translations are always synchronized (either in level 3
or level 2), so its not necessary to properly deal with them
in the invlpg handler.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03 09:32:23 +02:00
Marcelo Tosatti 7c93be44a4 KVM: VMX: move CR3/PDPTR update to vmx_set_cr3
GUEST_CR3 is updated via kvm_set_cr3 whenever CR3 is modified from
outside guest context. Similarly pdptrs are updated via load_pdptrs.

Let kvm_set_cr3 perform the update, removing it from the vcpu_run
fast path.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Acked-by: Acked-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03 09:32:22 +02:00
Gleb Natapov 1655e3a3dc KVM: remove duplicated task_switch check
Probably introduced by a bad merge.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03 09:32:22 +02:00
Hollis Blanchard c0a187e12d KVM: powerpc: Fix BUILD_BUG_ON condition
The old BUILD_BUG_ON implementation didn't work with __builtin_constant_p().
Fixing that revealed this test had been inverted for a long time without
anybody noticing...

Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03 09:32:22 +02:00
Avi Kivity 26bb0981b3 KVM: VMX: Use shared msr infrastructure
Instead of reloading syscall MSRs on every preemption, use the new shared
msr infrastructure to reload them at the last possible minute (just before
exit to userspace).

Improves vcpu/idle/vcpu switches by about 2000 cycles (when EFER needs to be
reloaded as well).

[jan: fix slot index missing indirection]

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03 09:32:22 +02:00
Avi Kivity 18863bdd60 KVM: x86 shared msr infrastructure
The various syscall-related MSRs are fairly expensive to switch.  Currently
we switch them on every vcpu preemption, which is far too often:

- if we're switching to a kernel thread (idle task, threaded interrupt,
  kernel-mode virtio server (vhost-net), for example) and back, then
  there's no need to switch those MSRs since kernel threasd won't
  be exiting to userspace.

- if we're switching to another guest running an identical OS, most likely
  those MSRs will have the same value, so there's little point in reloading
  them.

- if we're running the same OS on the guest and host, the MSRs will have
  identical values and reloading is unnecessary.

This patch uses the new user return notifiers to implement last-minute
switching, and checks the msr values to avoid unnecessary reloading.

Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03 09:32:21 +02:00
Avi Kivity 44ea2b1758 KVM: VMX: Move MSR_KERNEL_GS_BASE out of the vmx autoload msr area
Currently MSR_KERNEL_GS_BASE is saved and restored as part of the
guest/host msr reloading.  Since we wish to lazy-restore all the other
msrs, save and reload MSR_KERNEL_GS_BASE explicitly instead of using
the common code.

Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03 09:32:21 +02:00
Eduardo Habkost 3ce672d484 KVM: SVM: init_vmcb(): remove redundant save->cr0 initialization
The svm_set_cr0() call will initialize save->cr0 properly even when npt is
enabled, clearing the NW and CD bits as expected, so we don't need to
initialize it manually for npt_enabled anymore.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03 09:32:21 +02:00
Eduardo Habkost 18fa000ae4 KVM: SVM: Reset cr0 properly on vcpu reset
svm_vcpu_reset() was not properly resetting the contents of the guest-visible
cr0 register, causing the following issue:
https://bugzilla.redhat.com/show_bug.cgi?id=525699

Without resetting cr0 properly, the vcpu was running the SIPI bootstrap routine
with paging enabled, making the vcpu get a pagefault exception while trying to
run it.

Instead of setting vmcb->save.cr0 directly, the new code just resets
kvm->arch.cr0 and calls kvm_set_cr0(). The bits that were set/cleared on
vmcb->save.cr0 (PG, WP, !CD, !NW) will be set properly by svm_set_cr0().

kvm_set_cr0() is used instead of calling svm_set_cr0() directly to make sure
kvm_mmu_reset_context() is called to reset the mmu to nonpaging mode.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03 09:32:21 +02:00
Eduardo Habkost fa40052ca0 KVM: VMX: Use macros instead of hex value on cr0 initialization
This should have no effect, it is just to make the code clearer.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03 09:32:21 +02:00
Glauber Costa afbcf7ab8d KVM: allow userspace to adjust kvmclock offset
When we migrate a kvm guest that uses pvclock between two hosts, we may
suffer a large skew. This is because there can be significant differences
between the monotonic clock of the hosts involved. When a new host with
a much larger monotonic time starts running the guest, the view of time
will be significantly impacted.

Situation is much worse when we do the opposite, and migrate to a host with
a smaller monotonic clock.

This proposed ioctl will allow userspace to inform us what is the monotonic
clock value in the source host, so we can keep the time skew short, and
more importantly, never goes backwards. Userspace may also need to trigger
the current data, since from the first migration onwards, it won't be
reflected by a simple call to clock_gettime() anymore.

[marcelo: future-proof abi with a flags field]
[jan: fix KVM_GET_CLOCK by clearing flags field instead of checking it]

Signed-off-by: Glauber Costa <glommer@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03 09:32:19 +02:00
Jan Kiszka 6be7d3062b KVM: SVM: Cleanup NMI singlestep
Push the NMI-related singlestep variable into vcpu_svm. It's dealing
with an AMD-specific deficit, nothing generic for x86.

Acked-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

 arch/x86/include/asm/kvm_host.h |    1 -
 arch/x86/kvm/svm.c              |   12 +++++++-----
 2 files changed, 7 insertions(+), 6 deletions(-)
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-12-03 09:32:19 +02:00
Jan Kiszka 94fe45da48 KVM: x86: Fix guest single-stepping while interruptible
Commit 705c5323 opened the doors of hell by unconditionally injecting
single-step flags as long as guest_debug signaled this. This doesn't
work when the guest branches into some interrupt or exception handler
and triggers a vmexit with flag reloading.

Fix it by saving cs:rip when user space requests single-stepping and
restricting the trace flag injection to this guest code position.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-12-03 09:32:19 +02:00
Ed Swierk ffde22ac53 KVM: Xen PV-on-HVM guest support
Support for Xen PV-on-HVM guests can be implemented almost entirely in
userspace, except for handling one annoying MSR that maps a Xen
hypercall blob into guest address space.

A generic mechanism to delegate MSR writes to userspace seems overkill
and risks encouraging similar MSR abuse in the future.  Thus this patch
adds special support for the Xen HVM MSR.

I implemented a new ioctl, KVM_XEN_HVM_CONFIG, that lets userspace tell
KVM which MSR the guest will write to, as well as the starting address
and size of the hypercall blobs (one each for 32-bit and 64-bit) that
userspace has loaded from files.  When the guest writes to the MSR, KVM
copies one page of the blob from userspace to the guest.

I've tested this patch with a hacked-up version of Gerd's userspace
code, booting a number of guests (CentOS 5.3 i386 and x86_64, and
FreeBSD 8.0-RC1 amd64) and exercising PV network and block devices.

[jan: fix i386 build warning]
[avi: future proof abi with a flags field]

Signed-off-by: Ed Swierk <eswierk@aristanetworks.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03 09:32:18 +02:00
Jan Kiszka 94c30d9ca6 KVM: x86: Drop unneeded CONFIG_HAS_IOMEM check
This (broken) check dates back to the days when this code was shared
across architectures. x86 has IOMEM, so drop it.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-12-03 09:32:18 +02:00
Marcelo Tosatti 9fb41ba896 KVM: VMX: fix handle_pause declaration
There's no kvm_run argument anymore.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-12-03 09:32:18 +02:00
Zachary Amsden 6b7d7e762b KVM: x86: Harden against cpufreq
If cpufreq can't determine the CPU khz, or cpufreq is not compiled in,
we should fallback to the measured TSC khz.

Signed-off-by: Zachary Amsden <zamsden@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-12-03 09:32:18 +02:00
Mark Langsdorf 565d0998ec KVM: SVM: Support Pause Filter in AMD processors
New AMD processors (Family 0x10 models 8+) support the Pause
Filter Feature.  This feature creates a new field in the VMCB
called Pause Filter Count.  If Pause Filter Count is greater
than 0 and intercepting PAUSEs is enabled, the processor will
increment an internal counter when a PAUSE instruction occurs
instead of intercepting.  When the internal counter reaches the
Pause Filter Count value, a PAUSE intercept will occur.

This feature can be used to detect contended spinlocks,
especially when the lock holding VCPU is not scheduled.
Rescheduling another VCPU prevents the VCPU seeking the
lock from wasting its quantum by spinning idly.

Experimental results show that most spinlocks are held
for less than 1000 PAUSE cycles or more than a few
thousand.  Default the Pause Filter Counter to 3000 to
detect the contended spinlocks.

Processor support for this feature is indicated by a CPUID
bit.

On a 24 core system running 4 guests each with 16 VCPUs,
this patch improved overall performance of each guest's
32 job kernbench by approximately 3-5% when combined
with a scheduler algorithm thati caused the VCPU to
sleep for a brief period. Further performance improvement
may be possible with a more sophisticated yield algorithm.

Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-12-03 09:32:17 +02:00
Zhai, Edwin 4b8d54f972 KVM: VMX: Add support for Pause-Loop Exiting
New NHM processors will support Pause-Loop Exiting by adding 2 VM-execution
control fields:
PLE_Gap    - upper bound on the amount of time between two successive
             executions of PAUSE in a loop.
PLE_Window - upper bound on the amount of time a guest is allowed to execute in
             a PAUSE loop

If the time, between this execution of PAUSE and previous one, exceeds the
PLE_Gap, processor consider this PAUSE belongs to a new loop.
Otherwise, processor determins the the total execution time of this loop(since
1st PAUSE in this loop), and triggers a VM exit if total time exceeds the
PLE_Window.
* Refer SDM volume 3b section 21.6.13 & 22.1.3.

Pause-Loop Exiting can be used to detect Lock-Holder Preemption, where one VP
is sched-out after hold a spinlock, then other VPs for same lock are sched-in
to waste the CPU time.

Our tests indicate that most spinlocks are held for less than 212 cycles.
Performance tests show that with 2X LP over-commitment we can get +2% perf
improvement for kernel build(Even more perf gain with more LPs).

Signed-off-by: Zhai Edwin <edwin.zhai@intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-12-03 09:32:17 +02:00
Joerg Roedel d36f19e9ec KVM: SVM: Remove nsvm_printk debugging code
With all important informations now delivered through
tracepoints we can savely remove the nsvm_printk debugging
code for nested svm.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-12-03 09:32:17 +02:00
Joerg Roedel 532a46b989 KVM: SVM: Add tracepoint for skinit instruction
This patch adds a tracepoint for the event that the guest
executed the SKINIT instruction. This information is
important because SKINIT is an SVM extenstion not yet
implemented by nested SVM and we may need this information
for debugging hypervisors that do not yet run on nested SVM.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-12-03 09:32:16 +02:00
Joerg Roedel ec1ff79084 KVM: SVM: Add tracepoint for invlpga instruction
This patch adds a tracepoint for the event that the guest
executed the INVLPGA instruction.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-12-03 09:32:16 +02:00
Joerg Roedel 236649de33 KVM: SVM: Add tracepoint for #vmexit because intr pending
This patch adds a special tracepoint for the event that a
nested #vmexit is injected because kvm wants to inject an
interrupt into the guest.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-12-03 09:32:16 +02:00
Joerg Roedel 17897f3668 KVM: SVM: Add tracepoint for injected #vmexit
This patch adds a tracepoint for a nested #vmexit that gets
re-injected to the guest.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-12-03 09:32:15 +02:00
Joerg Roedel d8cabddf7e KVM: SVM: Add tracepoint for nested #vmexit
This patch adds a tracepoint for every #vmexit we get from a
nested guest.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-12-03 09:32:15 +02:00
Joerg Roedel 0ac406de8f KVM: SVM: Add tracepoint for nested vmrun
This patch adds a dedicated kvm tracepoint for a nested
vmrun.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-12-03 09:32:15 +02:00
Joerg Roedel cd3ff653ae KVM: SVM: Move INTR vmexit out of atomic code
The nested SVM code emulates a #vmexit caused by a request
to open the irq window right in the request function. This
is a bug because the request function runs with preemption
and interrupts disabled but the #vmexit emulation might
sleep. This can cause a schedule()-while-atomic bug and is
fixed with this patch.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-12-03 09:32:15 +02:00
Alexander Graf 8d23c46624 KVM: SVM: Notify nested hypervisor of lost event injections
If event_inj is valid on a #vmexit the host CPU would write
the contents to exit_int_info, so the hypervisor knows that
the event wasn't injected.

We don't do this in nested SVM by now which is a bug and
fixed by this patch.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-12-03 09:32:14 +02:00
Glauber Costa e3267cbbbf KVM: x86: include pvclock MSRs in msrs_to_save
For a while now, we are issuing a rdmsr instruction to find out which
msrs in our save list are really supported by the underlying machine.
However, it fails to account for kvm-specific msrs, such as the pvclock
ones.

This patch moves then to the beginning of the list, and skip testing them.

Cc: stable@kernel.org
Signed-off-by: Glauber Costa <glommer@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-12-03 09:32:14 +02:00
Jan Kiszka 91586a3b7d KVM: x86: Rework guest single-step flag injection and filtering
Push TF and RF injection and filtering on guest single-stepping into the
vender get/set_rflags callbacks. This makes the whole mechanism more
robust wrt user space IOCTL order and instruction emulations.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03 09:32:14 +02:00
Marcelo Tosatti a68a6a7282 KVM: x86: disable paravirt mmu reporting
Disable paravirt MMU capability reporting, so that new (or rebooted)
guests switch to native operation.

Paravirt MMU is a burden to maintain and does not bring significant
advantages compared to shadow anymore.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03 09:32:14 +02:00
Jan Kiszka 355be0b930 KVM: x86: Refactor guest debug IOCTL handling
Much of so far vendor-specific code for setting up guest debug can
actually be handled by the generic code. This also fixes a minor deficit
in the SVM part /wrt processing KVM_GUESTDBG_ENABLE.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03 09:32:14 +02:00
Juan Quintela 201d945bcf KVM: remove pre_task_link setting in save_state_to_tss16
Now, also remove pre_task_link setting in save_state_to_tss16.

  commit b237ac37a1
  Author: Gleb Natapov <gleb@redhat.com>
  Date:   Mon Mar 30 16:03:24 2009 +0300

    KVM: Fix task switch back link handling.

CC: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-12-03 09:32:13 +02:00
Zachary Amsden 3230bb4707 KVM: Fix hotplug of CPUs
Both VMX and SVM require per-cpu memory allocation, which is done at module
init time, for only online cpus.

Backend was not allocating enough structure for all possible CPUs, so
new CPUs coming online could not be hardware enabled.

Signed-off-by: Zachary Amsden <zamsden@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-12-03 09:32:13 +02:00
Zachary Amsden e6732a5af9 KVM: Fix printk name error in svm.c
Signed-off-by: Zachary Amsden <zamsden@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-12-03 09:32:13 +02:00
Zachary Amsden 0cca790753 KVM: Kill the confusing tsc_ref_khz and ref_freq variables
They are globals, not clearly protected by any ordering or locking, and
vulnerable to various startup races.

Instead, for variable TSC machines, register the cpufreq notifier and get
the TSC frequency directly from the cpufreq machinery.  Not only is it
always right, it is also perfectly accurate, as no error prone measurement
is required.

On such machines, when a new CPU online is brought online, it isn't clear what
frequency it will start with, and it may not correspond to the reference, thus
in hardware_enable we clear the cpu_tsc_khz variable to zero and make sure
it is set before running on a VCPU.

Signed-off-by: Zachary Amsden <zamsden@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-12-03 09:32:12 +02:00
Zachary Amsden b820cc0ca2 KVM: Separate timer intialization into an indepedent function
Signed-off-by: Zachary Amsden <zamsden@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-12-03 09:32:12 +02:00
Joerg Roedel e935d48e1b KVM: SVM: Remove remaining occurences of rdtscll
This patch replaces them with native_read_tsc() which can
also be used in expressions and saves a variable on the
stack in this case.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-12-03 09:32:12 +02:00
Joerg Roedel 33527ad7e1 KVM: SVM: don't copy exit_int_info on nested vmrun
The exit_int_info field is only written by the hardware and
never read. So it does not need to be copied on a vmrun
emulation.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-12-03 09:32:11 +02:00
Joerg Roedel 7fcdb5103d KVM: SVM: reorganize svm_interrupt_allowed
This patch reorganizes the logic in svm_interrupt_allowed to
make it better to read. This is important because the logic
is a lot more complicated with Nested SVM.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-12-03 09:32:11 +02:00
Huang Weiyi bfc33beaed KVM: remove duplicated #include
Remove duplicated #include('s) in
  arch/x86/kvm/lapic.c

Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-12-03 09:32:10 +02:00
Alexander Graf 10474ae894 KVM: Activate Virtualization On Demand
X86 CPUs need to have some magic happening to enable the virtualization
extensions on them. This magic can result in unpleasant results for
users, like blocking other VMMs from working (vmx) or using invalid TLB
entries (svm).

Currently KVM activates virtualization when the respective kernel module
is loaded. This blocks us from autoloading KVM modules without breaking
other VMMs.

To circumvent this problem at least a bit, this patch introduces on
demand activation of virtualization. This means, that instead
virtualization is enabled on creation of the first virtual machine
and disabled on destruction of the last one.

So using this, KVM can be easily autoloaded, while keeping other
hypervisors usable.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03 09:32:10 +02:00
Marcelo Tosatti e8b3433a5c KVM: SVM: remove needless mmap_sem acquision from nested_svm_map
nested_svm_map unnecessarily takes mmap_sem around gfn_to_page, since
gfn_to_page / get_user_pages are responsible for it.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03 09:32:10 +02:00
Mohammed Gamal 80ced186d1 KVM: VMX: Enhance invalid guest state emulation
- Change returned handle_invalid_guest_state() to return relevant exit codes
- Move triggering the emulation from vmx_vcpu_run() to vmx_handle_exit()
- Return to userspace instead of repeatedly trying to emulate instructions that have already failed

Signed-off-by: Mohammed Gamal <m.gamal005@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-12-03 09:32:09 +02:00
Mohammed Gamal abcf14b560 KVM: x86 emulator: Add pusha and popa instructions
This adds pusha and popa instructions (opcodes 0x60-0x61), this enables booting
MINIX with invalid guest state emulation on.

[marcelo: remove unused variable]

Signed-off-by: Mohammed Gamal <m.gamal005@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03 09:32:09 +02:00
Mohammed Gamal 94677e61fd KVM: x86 emulator: Add missing decoder flags for 'or' instructions
Add missing decoder flags for or instructions (0xc-0xd).

Signed-off-by: Mohammed Gamal <m.gamal005@gmail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03 09:32:09 +02:00
Avi Kivity bfd99ff5d4 KVM: Move assigned device code to own file
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03 09:32:09 +02:00
Avi Kivity 367e1319b2 KVM: Return -ENOTTY on unrecognized ioctls
Not the incorrect -EINVAL.

Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03 09:32:08 +02:00
Gleb Natapov 680b3648ba KVM: Drop kvm->irq_lock lock from irq injection path
The only thing it protects now is interrupt injection into lapic and
this can work lockless. Even now with kvm->irq_lock in place access
to lapic is not entirely serialized since vcpu access doesn't take
kvm->irq_lock.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03 09:32:08 +02:00
Gleb Natapov eba0226bdf KVM: Move IO APIC to its own lock
The allows removal of irq_lock from the injection path.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03 09:32:08 +02:00
Gleb Natapov 136bdfeee7 KVM: Move irq ack notifier list to arch independent code
Mask irq notifier list is already there.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03 09:32:07 +02:00
Gleb Natapov 3e71f88bc9 KVM: Maintain back mapping from irqchip/pin to gsi
Maintain back mapping from irqchip/pin to gsi to speedup
interrupt acknowledgment notifications.

[avi: build fix on non-x86/ia64]

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03 09:32:07 +02:00
Gleb Natapov 1a6e4a8c27 KVM: Move irq sharing information to irqchip level
This removes assumptions that max GSIs is smaller than number of pins.
Sharing is tracked on pin level not GSI level.

[avi: no PIC on ia64]

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03 09:32:06 +02:00
Gleb Natapov 79c727d437 KVM: Call pic_clear_isr() on pic reset to reuse logic there
Also move call of ack notifiers after pic state change.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03 09:32:06 +02:00
Avi Kivity 851ba6922a KVM: Don't pass kvm_run arguments
They're just copies of vcpu->run, which is readily accessible.

Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03 09:32:06 +02:00
Mohammed Gamal d8769fedd4 KVM: x86 emulator: Introduce No64 decode option
Introduces a new decode option "No64", which is used for instructions that are
invalid in long mode.

Signed-off-by: Mohammed Gamal <m.gamal005@gmail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03 09:32:05 +02:00
Mohammed Gamal 0934ac9d13 KVM: x86 emulator: Add 'push/pop sreg' instructions
[avi: avoid buffer overflow]

Signed-off-by: Mohammed Gamal <m.gamal005@gmail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03 09:32:05 +02:00
Avi Kivity 58988b07cf Merge remote branch 'tip/x86/entry' into kvm-updates/2.6.33
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03 09:30:06 +02:00
Kristoffer Glembo c803ba9017 sparc,leon: init_leon srmmu cleanup
Removed unused assignment and capitalized srmmu name for sparc_leon

Signed-off-by: Kristoffer Glembo <kristoffer@gaisler.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-02 22:28:50 -08:00
Kristoffer Glembo fdd98ac96e sparc32: Remove early interrupt enable.
Enabling interrupts at this points causes the warning
"start_kernel(): bug: interrupts were enabled early"
to be printed in start_kernel().

Signed-off-by: Kristoffer Glembo <kristoffer@gaisler.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-02 22:28:49 -08:00
Kristoffer Glembo 3560f788fe sparc, leon: Added Aeroflex Gaisler entry in manufacturer_info structure
Signed-off-by: Kristoffer Glembo <kristoffer@gaisler.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-02 22:28:49 -08:00
Hidetoshi Seto fe5ed91ddc x86, mce: don't restart timer if disabled
Even it is in error path unlikely taken, add_timer_on() at
CPU_DOWN_FAILED* needs to be skipped if mce_timer is disabled.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Jan Beulich <jbeulich@novell.com>
Cc: <stable@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-12-02 21:27:32 -08:00
Olof Johansson 9a01609e18 arm: omap: Add omap3_defconfig
Having one combined defconfig that is the superset of the individual
defconfigs for OMAP3 platforms is useful for easily finding build
errors. Not to mention convenient as a base if you want to boot several
platforms with a single kernel image.

Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2009-12-02 16:52:21 -08:00
Florian Fainelli 4e7c81af3f MIPS: RB532: Fix devices.c compilation.
We should now use dev_set_drvdata to set the driver driver_data field.

Signed-off-by: Florian Fainelli <florian@openwrt.org>
Patchwork: http://patchwork.linux-mips.org/patch/747/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2009-12-02 18:09:51 +00:00
Ralf Baechle a91be9ee69 MIPS: Fix MIPS I build.
Broken by d63c63e889bbeeaa461a8addf1245f89f3ce4ece (lmo) rsp.
f1e39a4a61 (kernel.org).

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Patchwork: http://patchwork.linux-mips.org/patch/746/
2009-12-02 18:09:51 +00:00
Sascha Hauer 4c8b581dd2 i.MX27 audmux: Fix register offsets
We have two holes in the register space. The driver did not
handle this. Fix it.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
2009-12-02 12:17:16 +01:00
Alan Carvalho de Assis 9e3e7afe9b mx27: mxt_td60: Add support to SD/MMC
This patch configures iomux and i2c io expander in order to add
support to SD/MMC cards on i-MXT TD60.

Signed-off-by: Alan Carvalho de Assis <acassis@gmail.com>
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
2009-12-02 12:06:14 +01:00
Jan Beulich 99063c0bce x86/alternatives: No need for alternatives-asm.h to re-invent stuff already in asm.h
This at once also gets the alignment specification right for
x86-64.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
LKML-Reference: <4B0FF8F80200007800022708@vpn.id2.novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-02 11:39:45 +01:00
Jan Beulich 01be50a308 x86/alternatives: Check replacementlen <= instrlen at build time
Having run into the run-(boot-)time check a couple of times lately,
I finally took time to find a build-time check so that one doesn't
need to analyze the register/stack dump and resolve this (through
manual lookup in vmlinux) to the offending construct.

The assembler will emit a message like "Error: value of <num> too
large for field of 1 bytes at <offset>", which while not pointing
out the source location still makes analysis quite a bit easier.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
LKML-Reference: <4B0FF8AA0200007800022703@vpn.id2.novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-02 11:39:45 +01:00
Alexander Shishkin 183bd50f4f ARM: 5843/1: OMAP3: add AMBA devices for ETM and ETB
This enables on-chip tracing components found in omap3xxx.

Signed-off-by: Alexander Shishkin <virtuoso@slind.org>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-12-02 10:25:23 +00:00
Alexander Shishkin c5d6c7708c ARM: 5841/1: a driver for on-chip ETM and ETB
This driver implements support for on-chip Embedded Tracing Macrocell and
Embedded Trace Buffer. It allows to trigger tracing of kernel execution flow
and exporting trace output to userspace via character device and a sysrq
combo.

Trace output can then be decoded by a fairly simple open source tool [1]
which is already sufficient to get the idea of what the kernel is doing.

[1]: http://github.com/virtuoso/etm2human

Signed-off-by: Alexander Shishkin <virtuoso@slind.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-12-02 10:25:22 +00:00
Masami Hiramatsu e859cf8656 x86: Fix comments of register/stack access functions
Fix typos and some redundant comments of register/stack access
functions in asm/ptrace.h.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: systemtap <systemtap@sources.redhat.com>
Cc: DLE <dle-develop@lists.sourceforge.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Wenji Huang <wenji.huang@oracle.com>
Cc: Mahesh J Salgaonkar <mahesh@linux.vnet.ibm.com>
LKML-Reference: <20091201000222.7669.7477.stgit@harusame>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Suggested-by: Wenji Huang <wenji.huang@oracle.com>
2009-12-02 10:22:22 +01:00
Suresh Siddha 6d20792e85 x86: Remove unnecessary mdelay() from cpu_disable_common()
fixup_irqs() already has a mdelay(). Remove the extra and
unnecessary mdelay() from cpu_disable_common().

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Maciej W. Rozycki <macro@linux-mips.org>
Cc: ebiederm@xmission.com
Cc: garyhade@us.ibm.com
LKML-Reference: <20091201233335.232177348@sbs-t61.sc.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-02 10:11:02 +01:00
Suresh Siddha 1c83995b6c x86, ioapic: Document another case when level irq is seen as an edge
In the case when cpu goes offline, fixup_irqs() will forward any
unhandled interrupt on the offlined cpu to the new cpu
destination that is handling the corresponding interrupt. This
interrupt forwarding is done via IPI's. Hence, in this case also
level-triggered io-apic interrupt will be seen as an edge
interrupt in the cpu's APIC IRR.

Document this scenario in the code which handles this case by doing
an explicit EOI to the io-apic to clear remote IRR of the io-apic RTE.

Requested-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Maciej W. Rozycki <macro@linux-mips.org>
Cc: ebiederm@xmission.com
Cc: garyhade@us.ibm.com
LKML-Reference: <20091201233335.143970505@sbs-t61.sc.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-02 10:11:01 +01:00
Suresh Siddha c29d9db338 x86, ioapic: Fix the EOI register detection mechanism
Maciej W. Rozycki reported:

> 82093AA I/O APIC has its version set to 0x11 and it
> does not support the EOI register.  Similarly I/O APICs
> integrated into the 82379AB south bridge and the 82374EB/SB
> EISA component.

IO-APIC versions below 0x20 don't support EOI register.

Some of the Intel ICH Specs (ICH2 to ICH5) documents the io-apic
version as 0x2. This is an error with documentation and these
ICH chips use io-apic's of version 0x20 and indeed has a working
EOI register for the io-apic.

Fix the EOI register detection mechanism to check for version
0x20 and beyond.

And also, a platform can potentially  have io-apic's with
different versions. Make the EOI register check per io-apic.

Reported-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: ebiederm@xmission.com
Cc: garyhade@us.ibm.com
LKML-Reference: <20091201233335.065361533@sbs-t61.sc.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-02 10:11:01 +01:00
Maciej W. Rozycki ca64c47cec x86, io-apic: Move the effort of clearing remoteIRR explicitly before migrating the irq
When the level-triggered interrupt is seen as an edge interrupt,
we try to clear the remoteIRR explicitly (using either an
io-apic eoi register when present or through the idea of
changing trigger mode of the io-apic RTE to edge and then back
to level). But this explicit try also needs to happen before we
try to migrate the irq. Otherwise irq migration attempt will
fail anyhow, as it postpones the irq migration to a later
attempt when it sees the remoteIRR in the io-apic RTE still set.

Signed-off-by: "Maciej W. Rozycki" <macro@linux-mips.org>
Reviewed-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: ebiederm@xmission.com
Cc: garyhade@us.ibm.com
LKML-Reference: <20091201233334.975416130@sbs-t61.sc.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-02 10:11:00 +01:00
Frederic Weisbecker 1cedae7290 hw-breakpoints: Keep track of user disabled breakpoints
When we disable a breakpoint through dr7, we unregister it right
away, making us lose track of its corresponding address
register value.

It means that the following sequence would be unsupported:

 - set address in dr0
 - enable it through dr7
 - disable it through dr7
 - enable it through dr7

because we lost the address register value when we disabled the
breakpoint.

Don't unregister the disabled breakpoints but rather disable
them.

Reported-by: "K.Prasad" <prasad@linux.vnet.ibm.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <1259735536-9236-1-git-send-regression-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-02 09:59:03 +01:00
David S. Miller ff9c38bba3 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	net/mac80211/ht.c
2009-12-01 22:13:38 -08:00
Arnaldo Carvalho de Melo 5a5b6f6f62 MIPS: Wire up recvmmsg syscall
Reported-by: Jean-Mickael Guerin <jean-mickael.guerin@6wind.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-01 16:15:49 -08:00
wanzongshun 045868df2c ARM: 5844/1: rename w90p910_defconfig to n uc910_defconfig
rename w90p910_defconfig

Signed-off-by: Wan ZongShun <mcuos.com@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-12-01 23:08:31 +00:00
wanzongshun 39986ca6bd ARM: 5842/1: add spi resource support for nuc900
add spi resource support

Signed-off-by: Wan ZongShun <mcuos.com@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-12-01 23:08:31 +00:00
wanzongshun 6aeb4e4a9d ARM: 5839/1: add nuc960_defconfig
add nuc960_defconfig

Signed-off-by: Wan ZongShun <mcuos.com@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-12-01 23:08:30 +00:00
wanzongshun 9b5b495281 ARM: 5838/1: add nuc950_defconfig
add nuc950_defconfig

Signed-off-by: Wan ZongShun <mcuos.com@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-12-01 23:08:30 +00:00
Russell King d7931d9f7a Merge branch 'for-rmk' of git://git.marvell.com/orion into devel-stable 2009-12-01 18:22:54 +00:00
Russell King 421fe93cc4 ARM: ZERO_PAGE: Avoid flush_dcache_page() for zero page
The zero page is read-only, and has its cache state cleared during
boot.  No further maintanence for this page is required.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-12-01 18:20:07 +00:00
Russell King b7dc0b2cfc ARM: Avoid evaluating page_address() multiple times
page_address() is a function call rather than a macro, and so:

	if (page_address(page))
		do_something(page_address(page));

results in two calls to this function.  This is unnecessary; remove
the duplication.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-12-01 18:20:07 +00:00
Russell King 2f0b192633 ARM: Avoid duplicated implementation for VIVT cache flushing
We had two copies of the wrapper code for VIVT cache flushing - one in
asm/cacheflush.h and one in arch/arm/mm/flush.c.  Reduce this down to
one common copy.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-12-01 18:20:07 +00:00
Linus Torvalds 0d9ccfe1b5 Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
  MIPS: Loongson: Switch from flatmem to sparsemem
  MIPS: Loongson: Disallow 4kB pages
  MIPS: Add missing definition for MADV_HWPOISON.
  MIPS: Fix build error if __xchg() is not getting inlined.
  MIPS: IP22/IP28 Disable early printk to fix boot problems on some systems.
2009-12-01 08:26:44 -08:00
Wu Zhangjin f133f22dd6 MIPS: Loongson: Switch from flatmem to sparsemem
With flatmem hibernation for Loongson will fail, and there are also some
other problems such as broken files when using NFS or CIFS / Samba.

The config help of sparsemem says:

"This option provides some potential performance benefits, along with
decreased code complexity."

So to avoid the potential problems of FLATMEM, we disable FLATMEM directly
and use SPARSEMEM instead.

Related email thread:

http://groups.google.com/group/loongson-dev/browse_thread/thread/b6b65890ec2b0f24/feb43e5aa7f55d9b?show_docid=feb43e5aa7f55d9b

Reported-by: Tatu Kilappa <tatu.kilappa@gmail.com>
Signed-off-by: Wu Zhangjin <wuzhangjin@gmail.com>
Patchwork: http://patchwork.linux-mips.org/patch/737/
Cc: linux-mips@linux-mips.org
Cc: zhangfx@lemote.com
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2009-12-01 16:21:26 +00:00
Wu Zhangjin 315fe625f8 MIPS: Loongson: Disallow 4kB pages
Currently, with PAGE_SIZE_4KB, the kernel for loongson will hang on:

Kernel panic - not syncing: Attempted to kill init!

The possible reason is the cache aliases problem:

Loongson 2F has 64kb, 4 way L1 Cache, the way size is 16kb, which is bigger
then 4kb. so, If using 4kb page size, there is cache aliases problem.
To avoid this kind of problem, extra cache flushing.  The 2nd possible
solution is 16kb page size which avoids cache aliases without the need for
extra cache flushes.  So we disable 4kB pages until the aliasing issue is
solved.

Signed-off-by: Wu Zhangjin <wuzhangjin@gmail.com>
Patchwork: http://patchwork.linux-mips.org/patch/736/
Cc: linux-mips@linux-mips.org
Cc: zhangfx@lemote.com
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2009-12-01 16:21:26 +00:00
Ralf Baechle e1eb3a983b MIPS: Add missing definition for MADV_HWPOISON.
Thanks to Joseph S. Myers for reporting this.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Cc: "Joseph S. Myers" <joseph@codesourcery.com>
Patchwork: http://patchwork.linux-mips.org/patch/723/
2009-12-01 16:21:25 +00:00
Ralf Baechle c677189af9 MIPS: Fix build error if __xchg() is not getting inlined.
If __xchg() is not getting inlined the outline version of the function
will have a reference to __xchg_called_with_bad_pointer() which does not
exist remaining.  Fixed by using BUILD_BUG_ON() to check for allowable
operand sizes.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Patchwork: http://patchwork.linux-mips.org/patch/705/
2009-12-01 16:21:25 +00:00
Martin Michlmayr 2b5e63f6b8 MIPS: IP22/IP28 Disable early printk to fix boot problems on some systems.
Some Debian users have reported that the kernel hangs early during boot on
some IP22 systems.  Thomas Bogendoerfer found that this is due to a "bad
interaction between CONFIG_EARLY_PRINTK and overwritten prom memory during
early boot".  Since there's no fix yet, disable CONFIG_EARLY_PRINTK for now.

Signed-off-by: Martin Michlmayr <tbm@cyrius.com>
Cc: linux-mips@linux-mips.org
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Dmitri Vorobiev <dmitri.vorobiev@gmail.com>
Patchwork: http://patchwork.linux-mips.org/patch/702/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2009-12-01 16:21:25 +00:00
Linus Torvalds df87f8c06c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha-2.6:
  alpha: Fixup last users of irq_chip->typename
  Alpha: Rearrange thread info flags fixing two regressions
  arch/alpha/kernel: Add kmalloc NULL tests
  arch/alpha/kernel/sys_ruffian.c: Use DIV_ROUND_CLOSEST
2009-12-01 07:36:23 -08:00