linux

q3k/linux

Author	SHA1	Message	Date
Al Viro	a610d6e672	pull clearing RESTORE_SIGMASK into block_sigmask() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-06-01 12:58:49 -04:00
Al Viro	b7f9a11a6c	new helper: sigmask_to_save() replace boilerplate "should we use ->saved_sigmask or ->blocked?" with calls of obvious inlined helper... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-06-01 12:58:48 -04:00
Al Viro	51a7b448d4	new helper: restore_saved_sigmask() first fruits of ..._restore_sigmask() helpers: now we can take boilerplate "signal didn't have a handler, clear RESTORE_SIGMASK and restore the blocked mask from ->saved_mask" into a common helper. Open-coded instances switched... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-06-01 12:58:47 -04:00
Steven Rostedt	5963e317b1	ftrace/x86: Do not change stacks in DEBUG when calling lockdep When both DYNAMIC_FTRACE and LOCKDEP are set, the TRACE_IRQS_ON/OFF will call into the lockdep code. The lockdep code can call lots of functions that may be traced by ftrace. When ftrace is updating its code and hits a breakpoint, the breakpoint handler will call into lockdep. If lockdep happens to call a function that also has a breakpoint attached, it will jump back into the breakpoint handler resetting the stack to the debug stack and corrupt the contents currently on that stack. The 'do_sym' call that calls do_int3() is protected by modifying the IST table to point to a different location if another breakpoint is hit. But the TRACE_IRQS_OFF/ON are outside that protection, and if a breakpoint is hit from those, the stack will get corrupted, and the kernel will crash: [ 1013.243754] BUG: unable to handle kernel NULL pointer dereference at 0000000000000002 [ 1013.272665] IP: [<ffff880145cc0000>] 0xffff880145cbffff [ 1013.285186] PGD 1401b2067 PUD 14324c067 PMD 0 [ 1013.298832] Oops: 0010 [#1] PREEMPT SMP [ 1013.310600] CPU 2 [ 1013.317904] Modules linked in: ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables crc32c_intel ghash_clmulni_intel microcode usb_debug serio_raw pcspkr iTCO_wdt i2c_i801 iTCO_vendor_support e1000e nfsd nfs_acl auth_rpcgss lockd sunrpc i915 video i2c_algo_bit drm_kms_helper drm i2c_core [last unloaded: scsi_wait_scan] [ 1013.401848] [ 1013.407399] Pid: 112, comm: kworker/2:1 Not tainted 3.4.0+ #30 [ 1013.437943] RIP: 8eb8:[<ffff88014630a000>] [<ffff88014630a000>] 0xffff880146309fff [ 1013.459871] RSP: ffffffff8165e919:ffff88014780f408 EFLAGS: 00010046 [ 1013.477909] RAX: 0000000000000001 RBX: ffffffff81104020 RCX: 0000000000000000 [ 1013.499458] RDX: ffff880148008ea8 RSI: ffffffff8131ef40 RDI: ffffffff82203b20 [ 1013.521612] RBP: ffffffff81005751 R08: 0000000000000000 R09: 0000000000000000 [ 1013.543121] R10: ffffffff82cdc318 R11: 0000000000000000 R12: ffff880145cc0000 [ 1013.564614] R13: ffff880148008eb8 R14: 0000000000000002 R15: ffff88014780cb40 [ 1013.586108] FS: 0000000000000000(0000) GS:ffff880148000000(0000) knlGS:0000000000000000 [ 1013.609458] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 1013.627420] CR2: 0000000000000002 CR3: 0000000141f10000 CR4: 00000000001407e0 [ 1013.649051] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1013.670724] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 1013.692376] Process kworker/2:1 (pid: 112, threadinfo ffff88013fe0e000, task ffff88014020a6a0) [ 1013.717028] Stack: [ 1013.724131] ffff88014780f570 ffff880145cc0000 0000400000004000 0000000000000000 [ 1013.745918] cccccccccccccccc ffff88014780cca8 ffffffff811072bb ffffffff81651627 [ 1013.767870] ffffffff8118f8a7 ffffffff811072bb ffffffff81f2b6c5 ffffffff81f11bdb [ 1013.790021] Call Trace: [ 1013.800701] Code: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a <e7> d7 64 81 ff ff ff ff 01 00 00 00 00 00 00 00 65 d9 64 81 ff [ 1013.861443] RIP [<ffff88014630a000>] 0xffff880146309fff [ 1013.884466] RSP <ffff88014780f408> [ 1013.901507] CR2: 0000000000000002 The solution was to reuse the NMI functions that change the IDT table to make the debug stack keep its current stack (in kernel mode) when hitting a breakpoint: call debug_stack_set_zero TRACE_IRQS_ON call debug_stack_reset If the TRACE_IRQS_ON happens to hit a breakpoint then it will keep the current stack and not crash the box. Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-05-31 23:12:22 -04:00
Steven Rostedt	f8988175fd	x86: Allow nesting of the debug stack IDT setting When the NMI handler runs, it checks if it preempted a debug handler and if that handler is using the debug stack. If it is, it changes the IDT table not to update the stack, otherwise it will reset the debug stack and corrupt the debug handler it preempted. Now that ftrace uses breakpoints to change functions from nops to callers, many more places may hit a breakpoint. Unfortunately this includes some of the calls that lockdep performs. Which causes issues with the debug stack. It too needs to change the debug stack before tracing (if called from the debug handler). Allow the debug_stack_set_zero() and debug_stack_reset() to be nested so that the debug handlers can take advantage of them too. [ Used this_cpu_*() over __get_cpu_var() as suggested by H. Peter Anvin ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-05-31 23:12:21 -04:00
Steven Rostedt	c0525a6972	x86: Reset the debug_stack update counter When an NMI goes off and it sees that it preempted the debug stack, to keep the debug stack safe, it changes the IDT to point to one that does not modify the stack on breakpoint (to allow breakpoints in NMIs). But the variable that gets set to know to undo it on exit never gets cleared on exit. Thus every NMI will reset it on exit the first time it is done even if it does not need to be reset. [ Added H. Peter Anvin's suggestion to use this_cpu_read/write ] Cc: <stable@vger.kernel.org> # v3.3 Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-05-31 23:12:20 -04:00
Steven Rostedt	8a4d0a687a	ftrace: Use breakpoint method to update ftrace caller On boot up and module load, it is fine to modify the code directly, without the use of breakpoints. This is because boot up modification is done before SMP is initialized, thus the modification is serial, and module load is done before the module executes. But after that we must use a SMP safe method to modify running code. Otherwise, if we are running the function tracer and update its function (by starting off the stack tracer, or perf tracing) the change of the function called by the ftrace trampoline is done directly. If this is being executed on another CPU, that CPU may take a GPF and crash the kernel. The breakpoint method is used to change the nops at all the functions, but the change of the ftrace callback handler itself was still using a direct modification. If tracing was enabled and the function callback was changed then another CPU could fault if it was currently calling the original callback. This modification must use the breakpoint method too. Note, the direct method is still used for boot up and module load. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-05-31 23:12:19 -04:00
Steven Rostedt	a192cd0413	ftrace: Synchronize variable setting with breakpoints When the function tracer starts modifying the code via breakpoints it sets a variable (modifying_ftrace_code) to inform the breakpoint handler to call the ftrace int3 code. But there's no synchronization between setting this code and the handler, thus it is possible for the handler to be called on another CPU before it sees the variable. This will cause a kernel crash as the int3 handler will not know what to do with it. I originally added smp_mb()'s to force the visibility of the variable but H. Peter Anvin suggested that I just make it atomic. [ Added comments as suggested by Peter Zijlstra ] Suggested-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-05-31 23:12:17 -04:00
Linus Torvalds	fb21affa49	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal Pull second pile of signal handling patches from Al Viro: "This one is just task_work_add() series + remaining prereqs for it. There probably will be another pull request from that tree this cycle - at least for helpers, to get them out of the way for per-arch fixes remaining in the tree." Fix trivial conflict in kernel/irq/manage.c: the merge of Andrew's pile had brought in commit `97fd75b7b8` ("kernel/irq/manage.c: use the pr_foo() infrastructure to prefix printks") which changed one of the pr_err() calls that this merge moves around. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal: keys: kill task_struct->replacement_session_keyring keys: kill the dummy key_replace_session_keyring() keys: change keyctl_session_to_parent() to use task_work_add() genirq: reimplement exit_irq_thread() hook via task_work_add() task_work_add: generic process-context callbacks avr32: missed _TIF_NOTIFY_RESUME on one of do_notify_resume callers parisc: need to check NOTIFY_RESUME when exiting from syscall move key_repace_session_keyring() into tracehook_notify_resume() TIF_NOTIFY_RESUME is defined on all targets now	2012-05-31 18:47:30 -07:00
Linus Torvalds	2d117403b3	One more mce cleanup before the 3.5 merge window closes -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJPxpZ1AAoJEKurIx+X31iBACgP/jZiXAdu9JjYqvPA8/ACsntt S0fpbn40kKftn58x6Ddqemu/rg8Iy/ezsB7Fm93TYfBzDb8RGdkSiLj/KoO39Jzy Tmod/vJ0XMsBFgDob5GSKSOYMsnkO7//6jtnM09A8wDlxUwE3LV/3/84EqOrQH0n LR3Ltd6wnCc/HVasdDZX/814QcHe5KSoFMY2jUHWf0suOKcI22X57PQt9831bKky tvvsqFKSaOaerW9F1atwB2Qx37f0pUCu/Qo4KmBB0EVwSapRpHSDu657byft7VWQ FJ1eLfZF7lnVaYHxyCqz+wgTVBTsBPDt1xGkIJqSMMHtziG5iP3m0NSYLCJ5XJYx AcmF55hIx4yMCrIdiKc7wWs2K4U59+FiGJ2LgFCBZ3XLJoAuZnhf+WOh4ws/wi/j qDk9vK8KG3VBYIwZnTWb6N1bRtzznp1ElXjZR1byh/Cu2Ne/EWL64VydNT7ts0Ga 3mGF88keAlw04qHYtU/7y5WrHRKGV5LBjGujVV2e2a5dC6p7LO11UCLptEv11DWS LcbIegbrimWmMxVScJQkL12GEzZKHpZZvrFRIKiWkTA15R6R1OTC7VrywA2GjDbd h5mWKd7zV6Ankjmq9SuvnA1UazhC/r+uQ58INVpFVM+aFor32HVd6L7VRns7sv2Z WbQNYxv5O+D3J2K+dQNy =4tLS -----END PGP SIGNATURE----- Merge tag 'please-pull-mce' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras Pull mce cleanup from Tony Luck: "One more mce cleanup before the 3.5 merge window closes" * tag 'please-pull-mce' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: x86/mce: Cleanup timer mess	2012-05-31 10:53:37 -07:00
Thomas Gleixner	82f7af09e6	x86/mce: Cleanup timer mess Use unsigned long for dealing with jiffies not int. Rename the callback to something sensible. Use __this_cpu_read/write for accessing per cpu data. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2012-05-30 14:40:01 -07:00
zhenzhong.duan	2da06af810	x86, mtrr: Fix a type overflow in range_to_mtrr func When boot on sun G5+ with 4T mem, see an overflow in mtrr cleanup as below. BADgran_size: 2G chunk_size: 2G num_reg: 10 lose cover RAM: -18014398505283592M This is because 1<<31 sign extended. Use an unsigned long constant to fix it. Useful for mem larger than or equal to 4T. -v2: Use 64bit constant instead of explicit type conversion as suggested by Yinghai. Description updated too. Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com> Link: http://lkml.kernel.org/r/4FC5A77F.6060505@oracle.com Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2012-05-30 14:37:00 -07:00
H. Peter Anvin	bbd771474e	Merge branch 'x86/trampoline' into x86/urgent x86/trampoline contains an urgent commit which is necessarily on a newer baseline. Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2012-05-30 12:11:32 -07:00
Ingo Molnar	403e1c5b74	Merge branch 'x86/mce' into x86/urgent Merge in these fixlets. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-30 14:12:06 +02:00
Peter Zijlstra	9f646389aa	sched/x86: Use cpu_llc_shared_mask(cpu) for coregroup_mask Commit commit `8e7fbcbc2` ("sched: Remove stale power aware scheduling remnants and dysfunctional knobs") made a boo-boo with removing the power aware scheduling muck from the x86 topology bits. We should unconditionally use the llc_shared mask for multi-core. Reported-and-tested-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Borislav Petkov <bp@amd64.org> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Link: http://lkml.kernel.org/n/tip-lsksc2kfyeveb13avh327p0d@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-30 11:05:44 +02:00
Linus Torvalds	731a7378b8	Merge branch 'x86-trampoline-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 trampoline rework from H. Peter Anvin: "This code reworks all the "trampoline"/"realmode" code (various bits that need to live in the first megabyte of memory, most but not all of which runs in real mode at some point) in the kernel into a single object. The main reason for doing this is that it eliminates the last place in the kernel where we needed pages to be mapped RWX. This code separates all that code into proper R/RW/RX pages." Fix up conflicts in arch/x86/kernel/Makefile (mca removed next to reboot code), and arch/x86/kernel/reboot.c (reboot code moved around in one branch, modified in this one), and arch/x86/tools/relocs.c (mostly same code came in earlier due to working around the ld bugs just before the 3.4 release). Also remove stale x86-relocs entry from scripts/.gitignore as per Peter Anvin. * commit '61f5446169046c217a5479517edac3a890c3bee7': (36 commits) x86, realmode: Move end signature into header.S x86, relocs: When printing an error, say relative or absolute x86, relocs: More relocations which may end up as absolute x86, relocs: Workaround for binutils 2.22.52.0.1 section bug xen-acpi-processor: Add missing #include <xen/xen.h> acpi, bgrd: Add missing <linux/io.h> to drivers/acpi/bgrt.c x86, realmode: Change EFER to a single u64 field x86, realmode: Move kernel/realmode.c to realmode/init.c x86, realmode: Move not-common bits out of trampoline_common.S x86, realmode: Mask out EFER.LMA when saving trampoline EFER x86, realmode: Fix no cache bits test in reboot_32.S x86, realmode: Make sure all generated files are listed in targets x86, realmode: build fix: remove duplicate build x86, realmode: read cr4 and EFER from kernel for 64-bit trampoline x86, realmode: fixes compilation issue in tboot.c x86, realmode: move relocs from scripts/ to arch/x86/tools x86, realmode: header for trampoline code x86, realmode: flattened rm hierachy x86, realmode: don't copy real_mode_header x86, realmode: fix 64-bit wakeup sequence ...	2012-05-29 20:14:53 -07:00
Bjorn Helgaas	365811d6f9	x86: print physical addresses consistently with other parts of kernel Print physical address info in a style consistent with the %pR style used elsewhere in the kernel. For example: -found SMP MP-table at [ffff8800000fce90] fce90 +found SMP MP-table at [mem 0x000fce90-0x000fce9f] mapped at [ffff8800000fce90] -initial memory mapped : 0 - 20000000 +initial memory mapped: [mem 0x00000000-0x1fffffff] -Base memory trampoline at [ffff88000009c000] 9c000 size 8192 +Base memory trampoline [mem 0x0009c000-0x0009dfff] mapped at [ffff88000009c000] -SRAT: Node 0 PXM 0 0-80000000 +SRAT: Node 0 PXM 0 [mem 0x00000000-0x7fffffff] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-29 16:22:21 -07:00
Bjorn Helgaas	91eb0f67c3	x86: print e820 physical addresses consistently with other parts of kernel Print physical address info in a style consistent with the %pR style used elsewhere in the kernel. For example: -BIOS-provided physical RAM map: +e820: BIOS-provided physical RAM map: - BIOS-e820: 0000000000000100 - 000000000009e000 (usable) +BIOS-e820: [mem 0x0000000000000100-0x000000000009dfff] usable -Allocating PCI resources starting at 90000000 (gap: 90000000:6ed1c000) +e820: [mem 0x90000000-0xfed1bfff] available for PCI devices -reserve RAM buffer: 000000000009e000 - 000000000009ffff +e820: reserve RAM buffer [mem 0x0009e000-0x0009ffff] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-29 16:22:20 -07:00
Linus Torvalds	786f02b719	x86/mce merge window patches (including two that make error_context() checks less sucky) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJPv7OiAAoJEKurIx+X31iBQxEP/RV8YO4nrozhHY597qabzfJc 4YoCOma+wUyhXPDmZI80XvrIlcCq7TEJL1HTaAyA5rnyYvRHpM+uXDCRmbJDI4e0 gA42/Y8+lbmR6BLY8sdptCXIWxw/d8wYEKK2BgNsPhkJxODGW/gVAws93erist/v yepq+GwI0QGAeRlO6AYgE7NwQmHXK5AdfH3phHUTVABVIUGH5+Zp6471FTB+hYy5 aNvUnL0hw8vxrbpDL/Le359etPqC6wsELCIQ9wVtWCD0/UJM6Yd3j0+CKQ7q/KHU 7zMcP+OCGTJ3koMhEbFIOnxuswWDGq5y/qIzSXMEEemGqgxqFUvX3wUeZ3HFAFNx nJ7ZaA813t7Bud4G4WwESxMGQpxI7FTvxnF1ow3IlRsMtV4ffvAS9xvWi0GQJrVY xixK7G87PGAm6fP9Zbb/lQlRO8gD498j4rfI9MOsUuY9QgFNcH2eg6c4O0iHDpos WkSgUaM49Q610JslrxsXp+BZZLBF/wbcjcFiQGFAWOIKTKgRQ99+dXAQY7fw9CIf /wNl9MkbOvJdPL9OfLTmAYAMXyaXbOX8qcvItwqcBsUT0AV863NdIXtS4BXBOrMs 5u16CDX1ieFAlA2dzhynvE0Zd1Ws6wfe5W/BgtQ+H175uHFr8pHAxsBTX8GSNrXG /bSWWrR3CIBRHoWCJMmH =kG4e -----END PGP SIGNATURE----- Merge tag 'x86-mce-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras Pull x86/mce merge window patches from Tony Luck: "Including two that make error_context() checks less sucky" * tag 'x86-mce-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: x86/mce: Add instruction recovery signatures to mce-severity table x86/mce: Fix check for processor context when machine check was taken. MCE: Fix vm86 handling for 32bit mce handler x86/mce Add validation check before GHES error is recorded x86/mce: Avoid reading every machine check bank register twice.	2012-05-25 16:14:12 -07:00
Linus Torvalds	d484864dd9	Merge branch 'for-linus' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping Pull CMA and ARM DMA-mapping updates from Marek Szyprowski: "These patches contain two major updates for DMA mapping subsystem (mainly for ARM architecture). First one is Contiguous Memory Allocator (CMA) which makes it possible for device drivers to allocate big contiguous chunks of memory after the system has booted. The main difference from the similar frameworks is the fact that CMA allows to transparently reuse the memory region reserved for the big chunk allocation as a system memory, so no memory is wasted when no big chunk is allocated. Once the alloc request is issued, the framework migrates system pages to create space for the required big chunk of physically contiguous memory. For more information one can refer to nice LWN articles: - 'A reworked contiguous memory allocator': http://lwn.net/Articles/447405/ - 'CMA and ARM': http://lwn.net/Articles/450286/ - 'A deep dive into CMA': http://lwn.net/Articles/486301/ - and the following thread with the patches and links to all previous versions: https://lkml.org/lkml/2012/4/3/204 The main client for this new framework is ARM DMA-mapping subsystem. The second part provides a complete redesign in ARM DMA-mapping subsystem. The core implementation has been changed to use common struct dma_map_ops based infrastructure with the recent updates for new dma attributes merged in v3.4-rc2. This allows to use more than one implementation of dma-mapping calls and change/select them on the struct device basis. The first client of this new infractructure is dmabounce implementation which has been completely cut out of the core, common code. The last patch of this redesign update introduces a new, experimental implementation of dma-mapping calls on top of generic IOMMU framework. This lets ARM sub-platform to transparently use IOMMU for DMA-mapping calls if one provides required IOMMU hardware. For more information please refer to the following thread: http://www.spinics.net/lists/arm-kernel/msg175729.html The last patch merges changes from both updates and provides a resolution for the conflicts which cannot be avoided when patches have been applied on the same files (mainly arch/arm/mm/dma-mapping.c)." Acked by Andrew Morton <akpm@linux-foundation.org>: "Yup, this one please. It's had much work, plenty of review and I think even Russell is happy with it." * 'for-linus' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping: (28 commits) ARM: dma-mapping: use PMD size for section unmap cma: fix migration mode ARM: integrate CMA with DMA-mapping subsystem X86: integrate CMA with DMA-mapping subsystem drivers: add Contiguous Memory Allocator mm: trigger page reclaim in alloc_contig_range() to stabilise watermarks mm: extract reclaim code from __alloc_pages_direct_reclaim() mm: Serialize access to min_free_kbytes mm: page_isolation: MIGRATE_CMA isolation functions added mm: mmzone: MIGRATE_CMA migration type added mm: page_alloc: change fallbacks array handling mm: page_alloc: introduce alloc_contig_range() mm: compaction: export some of the functions mm: compaction: introduce isolate_freepages_range() mm: compaction: introduce map_pages() mm: compaction: introduce isolate_migratepages_range() mm: page_alloc: remove trailing whitespace ARM: dma-mapping: add support for IOMMU mapper ARM: dma-mapping: use alloc, mmap, free from dma_ops ARM: dma-mapping: remove redundant code and do the cleanup ... Conflicts: arch/x86/include/asm/dma-mapping.h	2012-05-25 09:18:59 -07:00
Jan Beulich	1b38a3a10f	x86: hpet: Fix copy-and-paste mistake in earlier change This fixes an oversight in `396e2c6fed` ("x86: Clear HPET configuration registers on startup"), noticed by Thomas Gleixner. Signed-off-by: Jan Beulich <jbeulich@suse.com> Link: http://lkml.kernel.org/r/4FBF7DA902000078000861EE@nat28.tlf.novell.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2012-05-25 15:32:29 +02:00
Linus Torvalds	07acfc2a93	Merge branch 'next' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull KVM changes from Avi Kivity: "Changes include additional instruction emulation, page-crossing MMIO, faster dirty logging, preventing the watchdog from killing a stopped guest, module autoload, a new MSI ABI, and some minor optimizations and fixes. Outside x86 we have a small s390 and a very large ppc update. Regarding the new (for kvm) rebaseless workflow, some of the patches that were merged before we switch trees had to be rebased, while others are true pulls. In either case the signoffs should be correct now." Fix up trivial conflicts in Documentation/feature-removal-schedule.txt arch/powerpc/kvm/book3s_segment.S and arch/x86/include/asm/kvm_para.h. I suspect the kvm_para.h resolution ends up doing the "do I have cpuid" check effectively twice (it was done differently in two different commits), but better safe than sorry ;) * 'next' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (125 commits) KVM: make asm-generic/kvm_para.h have an ifdef __KERNEL__ block KVM: s390: onereg for timer related registers KVM: s390: epoch difference and TOD programmable field KVM: s390: KVM_GET/SET_ONEREG for s390 KVM: s390: add capability indicating COW support KVM: Fix mmu_reload() clash with nested vmx event injection KVM: MMU: Don't use RCU for lockless shadow walking KVM: VMX: Optimize %ds, %es reload KVM: VMX: Fix %ds/%es clobber KVM: x86 emulator: convert bsf/bsr instructions to emulate_2op_SrcV_nobyte() KVM: VMX: unlike vmcs on fail path KVM: PPC: Emulator: clean up SPR reads and writes KVM: PPC: Emulator: clean up instruction parsing kvm/powerpc: Add new ioctl to retreive server MMU infos kvm/book3s: Make kernel emulated H_PUT_TCE available for "PR" KVM KVM: PPC: bookehv: Fix r8/r13 storing in level exception handler KVM: PPC: Book3S: Enable IRQs during exit handling KVM: PPC: Fix PR KVM on POWER7 bare metal KVM: PPC: Fix stbux emulation KVM: PPC: bookehv: Use lwz/stw instead of PPC_LL/PPC_STL for 32-bit fields ...	2012-05-24 16:17:30 -07:00
Jiang Liu	f841d792e3	x86: Return IRQ_SET_MASK_OK_NOCOPY from irq affinity functions The interrupt chip irq_set_affinity() functions copy the affinity mask to irq_data->affinity but return 0, i.e. IRQ_SET_MASK_OK. IRQ_SET_MASK_OK causes the core code to do another redundant copy. Return IRQ_SET_MASK_OK_NOCOPY to avoid this. Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Naga Chumbalkar <nagananda.chumbalkar@hp.com> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: Cliff Wickman <cpw@sgi.com> Cc: Jiang Liu <liuj97@gmail.com> Cc: Keping Chen <chenkeping@huawei.com> Link: http://lkml.kernel.org/r/1333120296-13563-4-git-send-email-jiang.liu@huawei.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2012-05-24 23:16:33 +02:00
Feng Tang	151766fce8	Revert "x86/platform: Add a wallclock_init func to x86_platforms ops" This reverts commit `cf8ff6b6ab`. Just found this commit is a function duplicatation of commit `6b617e22` "x86/platform: Add a wallclock_init func to x86_init.timers ops". Let's revert it and sorry for the noise. Signed-off-by: Feng Tang <feng.tang@intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: Alan Cox <alan@linux.intel.com> Cc: Dirk Brandewie <dirk.brandewie@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2012-05-24 23:16:14 +02:00
Linus Torvalds	c7523a7c88	Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer updates from Thomas Gleixner. Various trivial conflict fixups in arch Kconfig due to addition of unrelated entries nearby. And one slightly more subtle one for sparc32 (new user of GENERIC_CLOCKEVENTS), fixed up as per Thomas. * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (31 commits) timekeeping: Fix a few minor newline issues. time: remove obsolete declaration ntp: Fix a stale comment and a few stray newlines. ntp: Correct TAI offset during leap second timers: Fixup the Kconfig consolidation fallout x86: Use generic time config unicore32: Use generic time config um: Use generic time config tile: Use generic time config sparc: Use: generic time config sh: Use generic time config score: Use generic time config s390: Use generic time config openrisc: Use generic time config powerpc: Use generic time config mn10300: Use generic time config mips: Use generic time config microblaze: Use generic time config m68k: Use generic time config m32r: Use generic time config ...	2012-05-24 13:29:46 -07:00
Linus Torvalds	654443e20d	Merge branch 'perf-uprobes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull user-space probe instrumentation from Ingo Molnar: "The uprobes code originates from SystemTap and has been used for years in Fedora and RHEL kernels. This version is much rewritten, reviews from PeterZ, Oleg and myself shaped the end result. This tree includes uprobes support in 'perf probe' - but SystemTap (and other tools) can take advantage of user probe points as well. Sample usage of uprobes via perf, for example to profile malloc() calls without modifying user-space binaries. First boot a new kernel with CONFIG_UPROBE_EVENT=y enabled. If you don't know which function you want to probe you can pick one from 'perf top' or can get a list all functions that can be probed within libc (binaries can be specified as well): $ perf probe -F -x /lib/libc.so.6 To probe libc's malloc(): $ perf probe -x /lib64/libc.so.6 malloc Added new event: probe_libc:malloc (on 0x7eac0) You can now use it in all perf tools, such as: perf record -e probe_libc:malloc -aR sleep 1 Make use of it to create a call graph (as the flat profile is going to look very boring): $ perf record -e probe_libc:malloc -gR make [ perf record: Woken up 173 times to write data ] [ perf record: Captured and wrote 44.190 MB perf.data (~1930712 $ perf report \| less 32.03% git libc-2.15.so [.] malloc \| --- malloc 29.49% cc1 libc-2.15.so [.] malloc \| --- malloc \| \|--0.95%-- 0x208eb1000000000 \| \|--0.63%-- htab_traverse_noresize 11.04% as libc-2.15.so [.] malloc \| --- malloc \| 7.15% ld libc-2.15.so [.] malloc \| --- malloc \| 5.07% sh libc-2.15.so [.] malloc \| --- malloc \| 4.99% python-config libc-2.15.so [.] malloc \| --- malloc \| 4.54% make libc-2.15.so [.] malloc \| --- malloc \| \|--7.34%-- glob \| \| \| \|--93.18%-- 0x41588f \| \| \| --6.82%-- glob \| 0x41588f ... Or: $ perf report -g flat \| less # Overhead Command Shared Object Symbol # ........ ............. ............. .......... # 32.03% git libc-2.15.so [.] malloc 27.19% malloc 29.49% cc1 libc-2.15.so [.] malloc 24.77% malloc 11.04% as libc-2.15.so [.] malloc 11.02% malloc 7.15% ld libc-2.15.so [.] malloc 6.57% malloc ... The core uprobes design is fairly straightforward: uprobes probe points register themselves at (inode:offset) addresses of libraries/binaries, after which all existing (or new) vmas that map that address will have a software breakpoint injected at that address. vmas are COW-ed to preserve original content. The probe points are kept in an rbtree. If user-space executes the probed inode:offset instruction address then an event is generated which can be recovered from the regular perf event channels and mmap-ed ring-buffer. Multiple probes at the same address are supported, they create a dynamic callback list of event consumers. The basic model is further complicated by the XOL speedup: the original instruction that is probed is copied (in an architecture specific fashion) and executed out of line when the probe triggers. The XOL area is a single vma per process, with a fixed number of entries (which limits probe execution parallelism). The API: uprobes are installed/removed via /sys/kernel/debug/tracing/uprobe_events, the API is integrated to align with the kprobes interface as much as possible, but is separate to it. Injecting a probe point is privileged operation, which can be relaxed by setting perf_paranoid to -1. You can use multiple probes as well and mix them with kprobes and regular PMU events or tracepoints, when instrumenting a task." Fix up trivial conflicts in mm/memory.c due to previous cleanup of unmap_single_vma(). * 'perf-uprobes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits) perf probe: Detect probe target when m/x options are absent perf probe: Provide perf interface for uprobes tracing: Fix kconfig warning due to a typo tracing: Provide trace events interface for uprobes tracing: Extract out common code for kprobes/uprobes trace events tracing: Modify is_delete, is_return from int to bool uprobes/core: Decrement uprobe count before the pages are unmapped uprobes/core: Make background page replacement logic account for rss_stat counters uprobes/core: Optimize probe hits with the help of a counter uprobes/core: Allocate XOL slots for uprobes use uprobes/core: Handle breakpoint and singlestep exceptions uprobes/core: Rename bkpt to swbp uprobes/core: Make order of function parameters consistent across functions uprobes/core: Make macro names consistent uprobes: Update copyright notices uprobes/core: Move insn to arch specific structure uprobes/core: Remove uprobe_opcode_sz uprobes/core: Make instruction tables volatile uprobes: Move to kernel/events/ uprobes/core: Clean up, refactor and improve the code ...	2012-05-24 11:39:34 -07:00
Al Viro	a42c6ded82	move key_repace_session_keyring() into tracehook_notify_resume() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-05-23 22:09:20 -04:00
Linus Torvalds	f9369910a6	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal Pull first series of signal handling cleanups from Al Viro: "This is just the first part of the queue (about a half of it); assorted fixes all over the place in signal handling. This one ends with all sigsuspend() implementations switched to generic one (->saved_sigmask-based). With this, a bunch of assorted old buglets are fixed and most of the missing bits of NOTIFY_RESUME hookup are in place. Two more fixes sit in arm and um trees respectively, and there's a couple of broken ones that need obvious fixes - parisc and avr32 check TIF_NOTIFY_RESUME only on one of two codepaths; fixes for that will happen in the next series" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal: (55 commits) unicore32: if there's no handler we need to restore sigmask, syscall or no syscall xtensa: add handling of TIF_NOTIFY_RESUME microblaze: drop 'oldset' argument of do_notify_resume() microblaze: handle TIF_NOTIFY_RESUME score: add handling of NOTIFY_RESUME to do_notify_resume() m68k: add TIF_NOTIFY_RESUME and handle it. sparc: kill ancient comment in sparc_sigaction() h8300: missing checks of __get_user()/__put_user() return values frv: missing checks of __get_user()/__put_user() return values cris: missing checks of __get_user()/__put_user() return values powerpc: missing checks of __get_user()/__put_user() return values sh: missing checks of __get_user()/__put_user() return values sparc: missing checks of __get_user()/__put_user() return values avr32: struct old_sigaction is never used m32r: struct old_sigaction is never used xtensa: xtensa_sigaction doesn't exist alpha: tidy signal delivery up score: don't open-code force_sigsegv() cris: don't open-code force_sigsegv() blackfin: don't open-code force_sigsegv() ...	2012-05-23 18:11:45 -07:00
Linus Torvalds	d5b4bb4d10	Merge branch 'delete-mca' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux Pull the MCA deletion branch from Paul Gortmaker: "It was good that we could support MCA machines back in the day, but realistically, nobody is using them anymore. They were mostly limited to 386-sx 16MHz CPU and some 486 class machines and never more than 64MB of RAM. Even the enthusiast hobbyist community seems to have dried up close to ten years ago, based on what you can find searching various websites dedicated to the relatively short lived hardware. So lets remove the support relating to CONFIG_MCA. There is no point carrying this forward, wasting cycles doing routine maintenance on it; wasting allyesconfig build time on validating it, wasting I/O on git grep'ping over it, and so on." Let's see if anybody screams. It generally has compiled, and James Bottomley pointed out that there was a MCA extension from NCR that allowed for up to 4GB of memory and PPro-class machines. So in theory there may be users out there. But even James (technically listed as a maintainer) doesn't actually have a system, and while Alan Cox claims to have a machine in his cellar that he offered to anybody who wants to take it off his hands, he didn't argue for keeping MCA support either. So we could bring it back. But somebody had better speak up and talk about how they have actually been using said MCA hardware with modern kernels for us to do that. And David already took the patch to delete all the networking driver code (commit `a5e371f61a`: "drivers/net: delete all code/drivers depending on CONFIG_MCA"). * 'delete-mca' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: MCA: delete all remaining traces of microchannel bus support. scsi: delete the MCA specific drivers and driver code serial: delete the MCA specific 8250 support. arm: remove ability to select CONFIG_MCA	2012-05-23 17:12:06 -07:00
Tony Luck	37c3459b67	x86/mce: Add instruction recovery signatures to mce-severity table Instruction recovery cases are very similar to the data recovery one we already have. Just trade out for a new MCACOD value. Signed-off-by: Tony Luck <tony.luck@intel.com>	2012-05-23 14:24:11 -07:00
Tony Luck	875e26648c	x86/mce: Fix check for processor context when machine check was taken. Linus pointed out that there was no value is checking whether m->ip was zero - because zero is a legimate value. If we have a reliable (or faked in the VM86 case) "m->cs" we can use it to tell whether we were in user mode or kernelwhen the machine check hit. Reported-by: Linus Torvalds <torvalds@linuxfoundation.org> Cc: <stable@vger.kernel.org> Signed-off-by: Tony Luck <tony.luck@intel.com>	2012-05-23 14:22:44 -07:00
Andi Kleen	a129a7c845	MCE: Fix vm86 handling for 32bit mce handler When running on 32bit the mce handler could misinterpret vm86 mode as ring 0. This can affect whether it does recovery or not; it was possible to panic when recovery was actually possible. Fix this by always forcing vm86 to look like ring 3. Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Tony Luck <tony.luck@intel.com>	2012-05-23 14:22:37 -07:00
Linus Torvalds	56edab3159	Merge branches 'perf-urgent-for-linus' and 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: - Leftover AMD PMU driver fix fix from the end of the v3.4 stabilization cycle. - Late tools/perf/ changes that missed the first round: * endianness fixes * event parsing improvements * libtraceevent fixes factored out from trace-cmd * perl scripting engine fixes related to libtraceevent, * testcase improvements * perf inject / pipe mode fixes * plus a kernel side fix * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86: Update event scheduling constraints for AMD family 15h models * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: Revert "sched, perf: Use a single callback into the scheduler" perf evlist: Show event attribute details perf tools: Bump default sample freq to 4 kHz perf buildid-list: Work better with pipe mode perf tools: Fix piped mode read code perf inject: Fix broken perf inject -b perf tools: rename HEADER_TRACE_INFO to HEADER_TRACING_DATA perf tools: Add union u64_swap type for swapping u64 data perf tools: Carry perf_event_attr bitfield throught different endians perf record: Fix documentation for branch stack sampling perf target: Add cpu flag to sample_type if target has cpu perf tools: Always try to build libtraceevent perf tools: Rename libparsevent to libtraceevent in Makefile perf script: Rename struct event to struct event_format in perl engine perf script: Explicitly handle known default print arg type perf tools: Add hardcoded name term for pmu events perf tools: Separate 'mem:' event scanner bits perf tools: Use allocated list for each parsed event perf tools: Add support for displaying event parser debug info perf test: Move parse event automated tests to separated object	2012-05-23 12:12:49 -07:00
Linus Torvalds	2335a8366f	Merge branch 'x86-reboot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 reboot changes from Ingo Molnar: "The biggest change is a gentler method of rebooting/stopping via IRQs first and then via NMIs. There are several cleanups in the tree as well." * 'x86-reboot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/reboot: Update nonmi_ipi parameter x86/reboot: Use NMI to assist in shutting down if IRQ fails Revert "x86, reboot: Use NMI instead of REBOOT_VECTOR to stop cpus" x86/reboot: Clean up coding style x86/reboot: Reduce to a single DMI table for reboot quirks	2012-05-23 11:31:22 -07:00
Linus Torvalds	44bc40e148	Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 platform changes from Ingo Molnar: "This tree includes assorted platform driver updates and a preparatory series for a platform with custom DMA remapping semantics (sta2x11 I/O hub)." * 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/vsmp: Fix number of CPUs when vsmp is disabled keyboard: Use BIOS Keyboard variable to set Numlock x86/olpc/xo1/sci: Report RTC wakeup events x86/olpc/xo1/sci: Produce wakeup events for buttons and switches x86, platform: Initial support for sta2x11 I/O hub x86: Introduce CONFIG_X86_DMA_REMAP x86-32: Introduce CONFIG_X86_DEV_DMA_OPS	2012-05-23 11:16:40 -07:00
Linus Torvalds	70311aaa8a	Merge branch 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull MCE updates from Ingo Molnar: "This tree updates/fixes MCE hardware support, it makes the APIC LVT thresholding interrupt optional because a subset of AMD F15h models don't support it." * 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, MCE, AMD: Disable error thresholding bank 4 on some models x86, MCE, AMD: Hide interrupt_enable sysfs node x86, MCE, AMD: Make APIC LVT thresholding interrupt optional	2012-05-23 11:01:52 -07:00
Linus Torvalds	ec0d7f18ab	Merge branch 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull fpu state cleanups from Ingo Molnar: "This tree streamlines further aspects of FPU handling by eliminating the prepare_to_copy() complication and moving that logic to arch_dup_task_struct(). It also fixes the FPU dumps in threaded core dumps, removes and old (and now invalid) assumption plus micro-optimizes the exit path by avoiding an FPU save for dead tasks." Fixed up trivial add-add conflict in arch/sh/kernel/process.c that came in because we now do the FPU handling in arch_dup_task_struct() rather than the legacy (and now gone) prepare_to_copy(). * 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, fpu: drop the fpu state during thread exit x86, xsave: remove thread_has_fpu() bug check in __sanitize_i387_state() coredump: ensure the fpu state is flushed for proper multi-threaded core dump fork: move the real prepare_to_copy() users to arch_dup_task_struct()	2012-05-23 10:59:07 -07:00
Linus Torvalds	269af9a1a0	Merge branch 'x86-extable-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull exception table generation updates from Ingo Molnar: "The biggest change here is to allow the build-time sorting of the exception table, to speed up booting. This is achieved by the architecture enabling BUILDTIME_EXTABLE_SORT. This option is enabled for x86 and MIPS currently. On x86 a number of fixes and changes were needed to allow build-time sorting of the exception table, in particular a relocation invariant exception table format was needed. This required the abstracting out of exception table protocol and the removal of 20 years of accumulated assumptions about the x86 exception table format. While at it, this tree also cleans up various other aspects of exception handling, such as early(er) exception handling for rdmsr_safe() et al. All in one, as the result of these changes the x86 exception code is now pretty nice and modern. As an added bonus any regressions in this code will be early and violent crashes, so if you see any of those, you'll know whom to blame!" Fix up trivial conflicts in arch/{mips,x86}/Kconfig files due to nearby modifications of other core architecture options. * 'x86-extable-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (35 commits) Revert "x86, extable: Disable presorted exception table for now" scripts/sortextable: Handle relative entries, and other cleanups x86, extable: Switch to relative exception table entries x86, extable: Disable presorted exception table for now x86, extable: Add _ASM_EXTABLE_EX() macro x86, extable: Remove open-coded exception table entries in arch/x86/ia32/ia32entry.S x86, extable: Remove open-coded exception table entries in arch/x86/include/asm/xsave.h x86, extable: Remove open-coded exception table entries in arch/x86/include/asm/kvm_host.h x86, extable: Remove the now-unused __ASM_EX_SEC macros x86, extable: Remove open-coded exception table entries in arch/x86/xen/xen-asm_32.S x86, extable: Remove open-coded exception table entries in arch/x86/um/checksum_32.S x86, extable: Remove open-coded exception table entries in arch/x86/lib/usercopy_32.c x86, extable: Remove open-coded exception table entries in arch/x86/lib/putuser.S x86, extable: Remove open-coded exception table entries in arch/x86/lib/getuser.S x86, extable: Remove open-coded exception table entries in arch/x86/lib/csum-copy_64.S x86, extable: Remove open-coded exception table entries in arch/x86/lib/copy_user_nocache_64.S x86, extable: Remove open-coded exception table entries in arch/x86/lib/copy_user_64.S x86, extable: Remove open-coded exception table entries in arch/x86/lib/checksum_32.S x86, extable: Remove open-coded exception table entries in arch/x86/kernel/test_rodata.c x86, extable: Remove open-coded exception table entries in arch/x86/kernel/entry_64.S ...	2012-05-23 10:44:35 -07:00
Linus Torvalds	e7b30a17c1	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/urgent branch from Ingo Molnar: "These are the fixes left over from the very end of the v3.4 stabilization cycle, plus one more fix." Ugh. Those KERN_CONT additions are just pointless. I think they came as a reaction to some of the early (broken) printk() work - but that was fixed before it was merged. * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, relocs: Build clean fix x86, printk: Add missing KERN_CONT to NMI selftest x86: Fix boot on Twinhead H12Y	2012-05-23 10:21:19 -07:00
Linus Torvalds	19bec32d7f	Merge branches 'x86-asm-for-linus', 'x86-cleanups-for-linus', 'x86-cpu-for-linus', 'x86-debug-for-linus' and 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull initial trivial x86 stuff from Ingo Molnar. Various random cleanups and trivial fixes. * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86-64: Eliminate dead ia32 syscall handlers * 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/pci-calgary_64.c: Remove obsoleted simple_strtoul() usage x86: Don't continue booting if we can't load the specified initrd x86: kernel/dumpstack.c simple_strtoul cleanup x86: kernel/check.c simple_strtoul cleanup debug: Add CONFIG_READABLE_ASM x86: spinlock.h: Remove REG_PTR_MODE * 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/cache_info: Fix setup of l2/l3 ids * 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: Avoid double stack traces with show_regs() * 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, microcode: microcode_core.c simple_strtoul cleanup	2012-05-23 10:09:50 -07:00
Borislav Petkov	80f033610f	x86/mce: Fix 32-bit build Got bitten again by the BIT() macro: arch/x86/kernel/cpu/mcheck/mce.c: In function '__mcheck_cpu_apply_quirks': arch/x86/kernel/cpu/mcheck/mce.c:1453:6: warning: left shift count >= width of type arch/x86/kernel/cpu/mcheck/mce.c:1454:7: warning: left shift count >= width of type Fix it already. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Cc: Frank Arnold <frank.arnold@amd.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1337684026-19740-2-git-send-email-bp@amd64.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-23 17:16:43 +02:00
Linus Torvalds	e8650a0823	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial updates from Jiri Kosina: "As usual, it's mostly typo fixes, redundant code elimination and some documentation updates." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (57 commits) edac, mips: don't change code that has been removed in edac/mips tree xtensa: Change mail addresses of Hannes Weiner and Oskar Schirmer lib: Change mail address of Oskar Schirmer net: Change mail address of Oskar Schirmer arm/m68k: Change mail address of Sebastian Hess i2c: Change mail address of Oskar Schirmer net: Fix tcp_build_and_update_options comment in struct tcp_sock atomic64_32.h: fix parameter naming mismatch Kconfig: replace "--- help ---" with "---help---" c2port: fix bogus Kconfig "default no" edac: Fix spelling errors. qla1280: Remove redundant NULL check before release_firmware() call remoteproc: remove redundant NULL check before release_firmware() qla2xxx: Remove redundant NULL check before release_firmware() call. aic94xx: Get rid of redundant NULL check before release_firmware() call tehuti: delete redundant NULL check before release_firmware() qlogic: get rid of a redundant test for NULL before call to release_firmware() bna: remove redundant NULL test before release_firmware() tg3: remove redundant NULL test before release_firmware() call typhoon: get rid of redundant conditional before all to release_firmware() ...	2012-05-22 19:22:50 -07:00
Linus Torvalds	f08b9c2f8a	Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/apic changes from Ingo Molnar: "Most of the changes are about helping virtualized guest kernels achieve better performance." Fix up trivial conflicts with the iommu updates to arch/x86/kernel/apic/io_apic.c * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/apic: Implement EIO micro-optimization x86/apic: Add apic->eoi_write() callback x86/apic: Use symbolic APIC_EOI_ACK x86/apic: Fix typo EIO_ACK -> EOI_ACK and document it x86/xen/apic: Add missing #include <xen/xen.h> x86/apic: Only compile local function if used with !CONFIG_GENERIC_PENDING_IRQ x86/apic: Fix UP boot crash x86: Conditionally update time when ack-ing pending irqs xen/apic: implement io apic read with hypercall Revert "xen/x86: Workaround 'x86/ioapic: Add register level checks to detect bogus io-apic entries'" xen/x86: Implement x86_apic_ops x86/apic: Replace io_apic_ops with x86_io_apic_ops.	2012-05-22 18:38:11 -07:00
Linus Torvalds	d79ee93de9	Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler changes from Ingo Molnar: "The biggest change is the cleanup/simplification of the load-balancer: instead of the current practice of architectures twiddling scheduler internal data structures and providing the scheduler domains in colorfully inconsistent ways, we now have generic scheduler code in kernel/sched/core.c:sched_init_numa() that looks at the architecture's node_distance() parameters and (while not fully trusting it) deducts a NUMA topology from it. This inevitably changes balancing behavior - hopefully for the better. There are various smaller optimizations, cleanups and fixlets as well" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched: Taint kernel with TAINT_WARN after sleep-in-atomic bug sched: Remove stale power aware scheduling remnants and dysfunctional knobs sched/debug: Fix printing large integers on 32-bit platforms sched/fair: Improve the ->group_imb logic sched/nohz: Fix rq->cpu_load[] calculations sched/numa: Don't scale the imbalance sched/fair: Revert sched-domain iteration breakage sched/x86: Rewrite set_cpu_sibling_map() sched/numa: Fix the new NUMA topology bits sched/numa: Rewrite the CONFIG_NUMA sched domain support sched/fair: Propagate 'struct lb_env' usage into find_busiest_group sched/fair: Add some serialization to the sched_domain load-balance walk sched/fair: Let minimally loaded cpu balance the group sched: Change rq->nr_running to unsigned int x86/numa: Check for nonsensical topologies on real hw as well x86/numa: Hard partition cpu topology masks on node boundaries x86/numa: Allow specifying node_distance() for numa=fake x86/sched: Make mwait_usable() heed to "idle=" kernel parameters properly sched: Update documentation and comments sched_rt: Avoid unnecessary dequeue and enqueue of pushable tasks in set_cpus_allowed_rt()	2012-05-22 18:27:32 -07:00
Linus Torvalds	2ff2b289a6	Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf changes from Ingo Molnar: "Lots of changes: - (much) improved assembly annotation support in perf report, with jump visualization, searching, navigation, visual output improvements and more. - kernel support for AMD IBS PMU hardware features. Notably 'perf record -e cycles:p' and 'perf top -e cycles:p' should work without skid now, like PEBS does on the Intel side, because it takes advantage of IBS transparently. - the libtracevents library: it is the first step towards unifying tracing tooling and perf, and it also gives a tracing library for external tools like powertop to rely on. - infrastructure: various improvements and refactoring of the UI modules and related code - infrastructure: cleanup and simplification of the profiling targets code (--uid, --pid, --tid, --cpu, --all-cpus, etc.) - tons of robustness fixes all around - various ftrace updates: speedups, cleanups, robustness improvements. - typing 'make' in tools/ will now give you a menu of projects to build and a short help text to explain what each does. - ... and lots of other changes I forgot to list. The perf record make bzImage + perf report regression you reported should be fixed." * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (166 commits) tracing: Remove kernel_lock annotations tracing: Fix initial buffer_size_kb state ring-buffer: Merge separate resize loops perf evsel: Create events initially disabled -- again perf tools: Split term type into value type and term type perf hists: Fix callchain ip printf format perf target: Add uses_mmap field ftrace: Remove selecting FRAME_POINTER with FUNCTION_TRACER ftrace/x86: Have x86 ftrace use the ftrace_modify_all_code() ftrace: Make ftrace_modify_all_code() global for archs to use ftrace: Return record ip addr for ftrace_location() ftrace: Consolidate ftrace_location() and ftrace_text_reserved() ftrace: Speed up search by skipping pages by address ftrace: Remove extra helper functions ftrace: Sort all function addresses, not just per page tracing: change CPU ring buffer state from tracing_cpumask tracing: Check return value of tracing_dentry_percpu() ring-buffer: Reset head page before running self test ring-buffer: Add integrity check at end of iter read ring-buffer: Make addition of pages in ring buffer atomic ...	2012-05-22 18:18:55 -07:00
Linus Torvalds	f5c101892f	Merge branch 'for-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu Pull percpu updates from Tejun Heo: "Contains Alex Shi's three patches to remove percpu_xxx() which overlap with this_cpu_xxx(). There shouldn't be any functional change." * 'for-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: percpu: remove percpu_xxx() functions x86: replace percpu_xxx funcs with this_cpu_xxx net: replace percpu_xxx funcs with this_cpu_xxx or __this_cpu_xxx	2012-05-22 17:37:47 -07:00
Al Viro	68f3f16d9a	new helper: sigsuspend() guts of saved_sigmask-based sigsuspend/rt_sigsuspend. Takes kernel sigset_t *. Open-coded instances replaced with calling it. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-05-21 23:52:30 -04:00
Linus Torvalds	cb60e3e65c	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security subsystem updates from James Morris: "New notable features: - The seccomp work from Will Drewry - PR_{GET,SET}_NO_NEW_PRIVS from Andy Lutomirski - Longer security labels for Smack from Casey Schaufler - Additional ptrace restriction modes for Yama by Kees Cook" Fix up trivial context conflicts in arch/x86/Kconfig and include/linux/filter.h * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (65 commits) apparmor: fix long path failure due to disconnected path apparmor: fix profile lookup for unconfined ima: fix filename hint to reflect script interpreter name KEYS: Don't check for NULL key pointer in key_validate() Smack: allow for significantly longer Smack labels v4 gfp flags for security_inode_alloc()? Smack: recursive tramsmute Yama: replace capable() with ns_capable() TOMOYO: Accept manager programs which do not start with / . KEYS: Add invalidation support KEYS: Do LRU discard in full keyrings KEYS: Permit in-place link replacement in keyring list KEYS: Perform RCU synchronisation on keys prior to key destruction KEYS: Announce key type (un)registration KEYS: Reorganise keys Makefile KEYS: Move the key config into security/keys/Kconfig KEYS: Use the compat keyctl() syscall wrapper on Sparc64 for Sparc32 compat Yama: remove an unused variable samples/seccomp: fix dependencies on arch macros Yama: add additional ptrace scopes ...	2012-05-21 20:27:36 -07:00
Linus Torvalds	bf67f3a5c4	Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull smp hotplug cleanups from Thomas Gleixner: "This series is merily a cleanup of code copied around in arch/* and not changing any of the real cpu hotplug horrors yet. I wish I'd had something more substantial for 3.5, but I underestimated the lurking horror..." Fix up trivial conflicts in arch/{arm,sparc,x86}/Kconfig and arch/sparc/include/asm/thread_info_32.h * 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (79 commits) um: Remove leftover declaration of alloc_task_struct_node() task_allocator: Use config switches instead of magic defines sparc: Use common threadinfo allocator score: Use common threadinfo allocator sh-use-common-threadinfo-allocator mn10300: Use common threadinfo allocator powerpc: Use common threadinfo allocator mips: Use common threadinfo allocator hexagon: Use common threadinfo allocator m32r: Use common threadinfo allocator frv: Use common threadinfo allocator cris: Use common threadinfo allocator x86: Use common threadinfo allocator c6x: Use common threadinfo allocator fork: Provide kmemcache based thread_info allocator tile: Use common threadinfo allocator fork: Provide weak arch_release_[task_struct\|thread_info] functions fork: Move thread info gfp flags to header fork: Remove the weak insanity sh: Remove cpu_idle_wait() ...	2012-05-21 19:43:57 -07:00
Linus Torvalds	5ec29e3149	Merge branch 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core locking updates from Ingo Molnar: "This update: - extends and simplifies x86 NMI callback handling code to enhance and fix the HP hw-watchdog driver - simplifies the x86 NMI callback handling code to fix a kmemcheck bug. - enhances the hung-task debugger" * 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/nmi: Fix the type of the nmiaction.flags field x86/nmi: Fix page faults by nmiaction if kmemcheck is enabled x86/nmi: Add new NMI queues to deal with IO_CHK and SERR watchdog, hpwdt: Remove priority option for NMI callback hung task debugging: Inject NMI when hung and going to panic	2012-05-21 19:25:14 -07:00
Linus Torvalds	abd209b708	Merge branch 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull iommu core changes from Ingo Molnar: "The IOMMU changes in this cycle are mostly about factoring out Intel-VT-d specific IRQ remapping details and introducing struct irq_remap_ops, in preparation for AMD specific hardware." * 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: iommu: Fix off by one in dmar_get_fault_reason() irq_remap: Fix the 'sub_handle' uninitialized warning irq_remap: Fix UP build failure irq_remap: Fix compiler warning with CONFIG_IRQ_REMAP=y iommu: rename intr_remapping.[ch] to irq_remapping.[ch] iommu: rename intr_remapping references to irq_remapping x86, iommu/vt-d: Clean up interfaces for interrupt remapping iommu/vt-d: Convert MSI remapping setup to remap_ops iommu/vt-d: Convert free_irte into a remap_ops callback iommu/vt-d: Convert IR set_affinity function to remap_ops iommu/vt-d: Convert IR ioapic-setup to use remap_ops iommu/vt-d: Convert missing apic.c intr-remapping call to remap_ops iommu/vt-d: Make intr-remapping initialization generic iommu: Rename intr_remapping files to intel_intr_remapping	2012-05-21 19:23:41 -07:00
H. Peter Anvin	d13a822e6d	Merge commit 'v3.4' into x86/urgent	2012-05-21 12:17:50 -07:00
Sasha Levin	29d679ffd8	x86, printk: Add missing KERN_CONT to NMI selftest Fix this behaviour: ---------------- \| NMI testsuite: -------------------- remote IPI: ok \| local IPI: ok \| Revealed due to a new modification to printk(). Signed-off-by: Sasha Levin <levinsasha928@gmail.com> Link: http://lkml.kernel.org/r/1336492573-17530-3-git-send-email-levinsasha928@gmail.com Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2012-05-21 10:13:04 -07:00
Marek Szyprowski	0a2b9a6ea9	X86: integrate CMA with DMA-mapping subsystem This patch adds support for CMA to dma-mapping subsystem for x86 architecture that uses common pci-dma/pci-nommu implementation. This allows to test CMA on KVM/QEMU and a lot of common x86 boxes. Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com> CC: Michal Nazarewicz <mina86@mina86.com> Acked-by: Arnd Bergmann <arnd@arndb.de>	2012-05-21 15:09:38 +02:00
Shuah Khan	74bc491795	x86/pci-calgary_64.c: Remove obsoleted simple_strtoul() usage Change calgary_parse_options() to call kstrtoul() instead of calling obsoleted simple_strtoul(). Signed-off-by: Shuah Khan <shuahkhan@gmail.com> Acked-by: Muli Ben-Yehuda <muli@cs.technion.ac.il> Cc: jdmason@kudzu.us Link: http://lkml.kernel.org/r/1337556268.3126.5.camel@lorien2 Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-21 10:29:40 +02:00
Ingo Molnar	bb27f55eb9	Merge branch 'perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Fixes for perf/core: - Rename some perf_target methods to avoid double negation, from Namhyung Kim. - Revert change to use per task events with inheritance, from Namhyung Kim. - Events should start disabled till children starts running, from David Ahern. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-21 09:17:50 +02:00
Linus Torvalds	3d9944978e	Fix machine check recovery -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJPtS5qAAoJEKurIx+X31iB+bAP/1FjUa2Nd53X89HFc6DoLktA 4AshM/JENhAfSbpTfGhg10ZOuwUa8sQ85Sf6yz1CsW6mEiJK/bDFrR1g2KrmejyL owgQvV6ukPABzfB27tWyXSVVBPmLkJviedLDautVpgPEPVuqauntmpe7fW51b5pf 92MxvYZ6AzgbDIjVaXP7e+kPomvgyM1C/UEvCgoyEcw81h5dchU9NSdXNBS67JS/ uOsMiMJyoNI46haYYbyFgMq3RmpYuxTLFj7qFDlUltyjP+vIvyLs38Ae/vkRMNfV sYXWRUQlRpvqg4MDIFVZx8FWTufzTm0BMS+Be7JkXKWdF3DAksq6FprOWIxfYi+d PMxwTFeSJzTINb9n9MiLt3TmuRy3mu37QWd28qJaJciNMkYWbclPqyJmjwsuAMKg hKSy2FvewIDHTAGOkwaVjS+L8O7j3TNRIAbweFA1d1K4rt6oSfwdn6GZrAb6MTvx oV0Fe1nAyY9mucyjknBTim3RYZ9qQ7H9SjL8JoaGihSzi988MgBun+iEQOQiwjNl YJNGIv0wb31qCKFtxU0A8rA0sFdhQRoLgigJfETb5a+2ghtG323oH6ZVWTnB8bp/ g1XOpus222/iJdL2AizMiJeUNV1IZ0SeLn2GoTFQIy9Uf11Zdgu5wLJumUva8EC6 JAACbpBlYufftIn278OH =tk2M -----END PGP SIGNATURE----- Merge tag 'linus-mce-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras Pull a machine check recovery fix from Tony Luck. I really don't like how the MCE code does some of the things it does, but this does seem to be an improvement. * tag 'linus-mce-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: x86/mce: Only restart instruction after machine check recovery if it is safe	2012-05-18 09:42:20 -07:00
Arnaldo Carvalho de Melo	16ee6576e2	Merge remote-tracking branch 'tip/perf/urgent' into perf/core Merge reason: We are going to queue up a dependent patch: "perf tools: Move parse event automated tests to separated object" That depends on: commit `e7c72d8` perf tools: Add 'G' and 'H' modifiers to event parsing Conflicts: tools/perf/builtin-stat.c Conflicted with the recent 'perf_target' patches when checking the result of perf_evsel open routines to see if a retry is needed to cope with older kernels where the exclude guest/host perf_event_attr bits were not used. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2012-05-18 13:13:33 -03:00
Robert Richter	5bcdf5e4fe	perf/x86: Update event scheduling constraints for AMD family 15h models This update is for newer family 15h cpu models from 0x02 to 0x1f. Signed-off-by: Robert Richter <robert.richter@amd.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: stable@vger.kernel.org # v2.6.39+ Link: http://lkml.kernel.org/r/1337337642-1621-1-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-18 14:07:46 +02:00
Michael S. Tsirkin	0ab711ae6a	x86/apic: Implement EIO micro-optimization We know both register and value for eoi beforehand, so there's no need to check it and no need to do math to calculate the msr. Saves instructions/branches on each EOI when using x2apic. I looked at the objdump output to verify that the generated code looks right and actually is shorter. The real improvemements will be on the KVM guest side though, those come in a later patch. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Cc: Avi Kivity <avi@redhat.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: gleb@redhat.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/e019d1a125316f10d3e3a4b2f6bda41473f4fb72.1337184153.git.mst@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-18 09:46:09 +02:00
Michael S. Tsirkin	2a43195d83	x86/apic: Add apic->eoi_write() callback Add eoi_write callback so that kvm can override eoi accesses without touching the rest of the apic. As a side-effect, this will enable a micro-optimization for apics using msr. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Cc: Avi Kivity <avi@redhat.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: gleb@redhat.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/0df425d746c49ac2ecc405174df87752869629d2.1337184153.git.mst@redhat.com [ tidied it up a bit ] Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-18 09:46:08 +02:00
Paul Gortmaker	bb8187d35f	MCA: delete all remaining traces of microchannel bus support. Hardware with MCA bus is limited to 386 and 486 class machines that are now 20+ years old and typically with less than 32MB of memory. A quick search on the internet, and you see that even the MCA hobbyist/enthusiast community has lost interest in the early 2000 era and never really even moved ahead from the 2.4 kernels to the 2.6 series. This deletes anything remaining related to CONFIG_MCA from core kernel code and from the x86 architecture. There is no point in carrying this any further into the future. One complication to watch for is inadvertently scooping up stuff relating to machine check, since there is overlap in the TLA name space (e.g. arch/x86/boot/mca.c). Cc: Thomas Gleixner <tglx@linutronix.de> Cc: James Bottomley <JBottomley@Parallels.com> Cc: x86@kernel.org Acked-by: Ingo Molnar <mingo@elte.hu> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2012-05-17 19:06:13 -04:00
Linus Torvalds	31ae98359d	Merge branches 'perf-urgent-for-linus', 'x86-urgent-for-linus' and 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf, x86 and scheduler updates from Ingo Molnar. * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: tracing: Do not enable function event with enable perf stat: handle ENXIO error for perf_event_open perf: Turn off compiler warnings for flex and bison generated files perf stat: Fix case where guest/host monitoring is not supported by kernel perf build-id: Fix filename size calculation * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, kvm: KVM paravirt kernels don't check for CPUID being unavailable x86: Fix section annotation of acpi_map_cpu2node() x86/microcode: Ensure that module is only loaded on supported Intel CPUs * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched: Fix KVM and ia64 boot crash due to sched_groups circular linked list assumption	2012-05-17 09:35:17 -07:00
Peter Zijlstra	8e7fbcbc22	sched: Remove stale power aware scheduling remnants and dysfunctional knobs It's been broken forever (i.e. it's not scheduling in a power aware fashion), as reported by Suresh and others sending patches, and nobody cares enough to fix it properly ... so remove it to make space free for something better. There's various problems with the code as it stands today, first and foremost the user interface which is bound to topology levels and has multiple values per level. This results in a state explosion which the administrator or distro needs to master and almost nobody does. Furthermore large configuration state spaces aren't good, it means the thing doesn't just work right because it's either under so many impossibe to meet constraints, or even if there's an achievable state workloads have to be aware of it precisely and can never meet it for dynamic workloads. So pushing this kind of decision to user-space was a bad idea even with a single knob - it's exponentially worse with knobs on every node of the topology. There is a proposal to replace the user interface with a single 3 state knob: sched_balance_policy := { performance, power, auto } where 'auto' would be the preferred default which looks at things like Battery/AC mode and possible cpufreq state or whatever the hw exposes to show us power use expectations - but there's been no progress on it in the past many months. Aside from that, the actual implementation of the various knobs is known to be broken. There have been sporadic attempts at fixing things but these always stop short of reaching a mergable state. Therefore this wholesale removal with the hopes of spurring people who care to come forward once again and work on a coherent replacement. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/1326104915.2442.53.camel@twins Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-17 13:48:56 +02:00
Steven Rostedt	e4f5d5440b	ftrace/x86: Have x86 ftrace use the ftrace_modify_all_code() To remove duplicate code, have the ftrace arch_ftrace_update_code() use the generic ftrace_modify_all_code(). This requires that the default ftrace_replace_code() becomes a weak function so that an arch may override it. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-05-16 20:00:27 -04:00
Suresh Siddha	1dcc8d7ba2	x86, fpu: drop the fpu state during thread exit There is no need to save any active fpu state to the task structure memory if the task is dead. Just drop the state instead. For example, this saved some 1770 xsave's during the system boot of a two socket Xeon system. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/1336692811-30576-4-git-send-email-suresh.b.siddha@intel.com Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2012-05-16 15:20:59 -07:00
Suresh Siddha	d75f1b391f	x86, xsave: remove thread_has_fpu() bug check in __sanitize_i387_state() Code paths like fork(), exit() and signal handling flush the fpu state explicitly to the structures in memory. BUG_ON() in __sanitize_i387_state() is checking that the fpu state is not live any more. But for preempt kernels, task can be scheduled out and in at any place and the preload_fpu logic during context switch can make the fpu registers live again. For example, consider a 64-bit Task which uses fpu frequently and as such you will find its fpu_counter mostly non-zero. During its time slice, kernel used fpu by doing kernel_fpu_begin/kernel_fpu_end(). After this, in the same scheduling slice, task-A got a signal to handle. Then during the signal setup path we got preempted when we are just before the sanitize_i387_state() in arch/x86/kernel/xsave.c:save_i387_xstate(). And when we come back we will have the fpu registers live that can hit the bug_on. Similarly during core dump, other threads can context-switch in and out (because of spurious wakeups while waiting for the coredump to finish in kernel/exit.c:exit_mm()) and the main thread dumping core can run into this bug when it finds some other thread with its fpu registers live on some other cpu. So remove the paranoid check for now, even though it caught a bug in the multi-threaded core dump case (fixed in the previous patch). Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/1336692811-30576-3-git-send-email-suresh.b.siddha@intel.com Cc: Oleg Nesterov <oleg@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2012-05-16 15:17:17 -07:00
Suresh Siddha	55ccf3fe3f	fork: move the real prepare_to_copy() users to arch_dup_task_struct() Historical prepare_to_copy() is mostly a no-op, duplicated for majority of the architectures and the rest following the x86 model of flushing the extended register state like fpu there. Remove it and use the arch_dup_task_struct() instead. Suggested-by: Oleg Nesterov <oleg@redhat.com> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/1336692811-30576-1-git-send-email-suresh.b.siddha@intel.com Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David Howells <dhowells@redhat.com> Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Chris Zankel <chris@zankel.net> Cc: Richard Henderson <rth@twiddle.net> Cc: Russell King <linux@arm.linux.org.uk> Cc: Haavard Skinnemoen <hskinnemoen@gmail.com> Cc: Mike Frysinger <vapier@gentoo.org> Cc: Mark Salter <msalter@redhat.com> Cc: Aurelien Jacquiot <a-jacquiot@ti.com> Cc: Mikael Starvik <starvik@axis.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Jonas Bonn <jonas@southpole.se> Cc: James E.J. Bottomley <jejb@parisc-linux.org> Cc: Helge Deller <deller@gmx.de> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Chen Liqin <liqin.chen@sunplusct.com> Cc: Lennox Wu <lennox.wu@gmail.com> Cc: David S. Miller <davem@davemloft.net> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Richard Weinberger <richard@nod.at> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2012-05-16 15:16:26 -07:00
Peter Jones	ab7b64e9ee	x86: Don't continue booting if we can't load the specified initrd If we've determined we can't do what the user asked, trying to do something else isn't going to make the user's life better. Without this the screen scrolls a bit and then you get a panic anyway, and it's nice not to have so much scroll after the real problem in bug reports. Link: http://lkml.kernel.org/r/1337190206-12121-1-git-send-email-pjones@redhat.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2012-05-16 13:59:52 -07:00
H. Peter Anvin	1371270188	x86, realmode: Move kernel/realmode.c to realmode/init.c Keep all the realmode code together, including initialization (only the rm/ subdirectory is actually built as real-mode code, anyway.) Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Jarkko Sakkinen <jarkko.sakkinen@intel.com>	2012-05-16 13:49:10 -07:00
H. Peter Anvin	796038799a	x86, realmode: Mask out EFER.LMA when saving trampoline EFER Some AMD processors apparently #GP(0) if EFER.LMA is set in WRMSR, rather than ignoring it. Thus, we need to mask it out. Reported-by: Ingo Molnar <mingo@kernel.org> Tested-by: Borislav Petkov <bp@alien8.de> Cc: Jarkko Sakkinen <jarkko.sakkinen@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Link: http://lkml.kernel.org/r/1336501366-28617-24-git-send-email-jarkko.sakkinen@intel.com	2012-05-16 13:22:41 -07:00
Shuah Khan	363f7ce325	x86: kernel/dumpstack.c simple_strtoul cleanup Change kstack_setup() and code_bytes_setup() in kernel/dumpstack.c to call kstrtoul() instead of calling obsoleted simple_strtoul(). Signed-off-by: Shuah Khan <shuahkhan@gmail.com> Link: http://lkml.kernel.org/r/1336327084.2897.15.camel@lorien2 Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2012-05-15 15:36:42 -07:00
Shuah Khan	5abe68e493	x86: kernel/check.c simple_strtoul cleanup Change set_corruption_check() and set_corruption_check_period() in kernel/check.c to call kstrtoul() instead of calling obsoleted simple_strtoul(). Signed-off-by: Shuah Khan <shuahkhan@gmail.com> Link: http://lkml.kernel.org/r/1336326908.2897.12.camel@lorien2 Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2012-05-15 15:36:41 -07:00
Tony Luck	dad1743e59	x86/mce: Only restart instruction after machine check recovery if it is safe Section 15.3.1.2 of the software developer manual has this to say about the RIPV bit in the IA32_MCG_STATUS register: RIPV (restart IP valid) flag, bit 0 — Indicates (when set) that program execution can be restarted reliably at the instruction pointed to by the instruction pointer pushed on the stack when the machine-check exception is generated. When clear, the program cannot be reliably restarted at the pushed instruction pointer. We need to save the state of this bit in do_machine_check() and use it in mce_notify_process() to force a signal; even if memory_failure() says it made a complete recovery ... e.g. replaced a clean LRU page. Acked-by: Borislav Petkov <bp@amd64.org> Signed-off-by: Tony Luck <tony.luck@intel.com>	2012-05-14 15:07:48 -07:00
Alex Shi	c6ae41e7d4	x86: replace percpu_xxx funcs with this_cpu_xxx Since percpu_xxx() serial functions are duplicated with this_cpu_xxx(). Removing percpu_xxx() definition and replacing them by this_cpu_xxx() in code. There is no function change in this patch, just preparation for later percpu_xxx serial function removing. On x86 machine the this_cpu_xxx() serial functions are same as __this_cpu_xxx() without no unnecessary premmpt enable/disable. Thanks for Stephen Rothwell, he found and fixed a i386 build error in the patch. Also thanks for Andrew Morton, he kept updating the patchset in Linus' tree. Signed-off-by: Alex Shi <alex.shi@intel.com> Acked-by: Christoph Lameter <cl@gentwo.org> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Tejun Heo <tj@kernel.org>	2012-05-14 14:15:31 -07:00
Peter Zijlstra	316ad24830	sched/x86: Rewrite set_cpu_sibling_map() Commit `ad7687dde` ("x86/numa: Check for nonsensical topologies on real hw as well") is broken in that the condition can trigger for valid setups but only changes the end result for invalid setups with no real means of discerning between those. Rewrite set_cpu_sibling_map() to make the code clearer and make sure to only warn when the check changes the end result. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-klcwahu3gx467uhfiqjyhdcs@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-14 15:05:25 +02:00
Ingo Molnar	9cba26e66d	Merge branch 'perf/uprobes' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/uprobes	2012-05-14 14:43:40 +02:00
Shai Fultheim	ead91d4b8c	x86/vsmp: Fix number of CPUs when vsmp is disabled In case CONFIG_X86_VSMP is not set, limit the number of CPUs to the number of CPUs of the first board. Also make CONFIG_X86_VSMP depend on CONFIG_SMP, as there's little point in having a vsmp machine with a single CPU. Signed-off-by: Shai Fultheim <shai@scalemp.com> [ido@wizery.com: rebased, fixed minor coding-style issues] Signed-off-by: Ido Yariv <ido@wizery.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-14 14:42:33 +02:00
Don Zickus	3aac27aba7	x86/reboot: Update nonmi_ipi parameter Update the nonmi_ipi parameter to reflect the simple change instead of the previous complicated one. There should be less of a need to use it but there may still be corner cases on older hardware that stumble into NMI issues. Signed-off-by: Don Zickus <dzickus@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1336761675-24296-4-git-send-email-dzickus@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-14 11:49:38 +02:00
Don Zickus	7d007d21e5	x86/reboot: Use NMI to assist in shutting down if IRQ fails For v3.3, I added code to use the NMI to stop other cpus in the panic case. The idea was to make sure all cpus on the system were definitely halted to help serialize the panic path to execute the rest of the code on a single cpu. The main problem it was trying to solve was how to stop a cpu that was spinning with its irqs disabled. A IPI irq would be stuck and couldn't get in there, but an NMI could. Things were great until we had another conversation about some pstore changes. Because some of the backend pstore still uses spinlocks to protect the device access, things could get ugly if a panic happened and we were stuck spinning on a lock. Now with the NMI shutting down cpus, we could assume no other cpus were running and just bust the spin lock and proceed. The counter argument was, well if you do that the backend could be in a screwed up state and you might not be able to save anything as a result. If we could have just given the cpu a little more time to finish things, we could have grabbed the spin lock cleanly and everything would have been fine. Well, how do give a cpu a 'little more time' in the panic case? For the most part you can't without spinning on the lock and even in that case, how long do you spin for? So instead of making it ugly in the pstore code, just mimic the idea that stop_machine had, which is block on an IRQ IPI until the remote cpu has re-enabled interrupts and left the critical region. Which is what happens now using REBOOT_IRQ. Then leave the NMI case for those cpus that are truly stuck after a short time. This leaves the current behaviour alone and just handle a corner case. Most systems should never have to enter the NMI code and if they do, print out a message in case the NMI itself causes another issue. Signed-off-by: Don Zickus <dzickus@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1336761675-24296-3-git-send-email-dzickus@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-14 11:49:37 +02:00
Don Zickus	5d2b86d90f	Revert "x86, reboot: Use NMI instead of REBOOT_VECTOR to stop cpus" This reverts commit `3603a2512f`. Originally I wanted a better hammer to shutdown cpus during panic. However, this really steps on the toes of various spinlocks in the panic path. Sometimes it is easier to wait for the IRQ to become re-enabled to indictate the cpu left the critical region and then shutdown the cpu. The next patch moves the NMI addition after the IRQ part. To make it easier to see the logic of everything, revert this patch and apply the next simpler patch. Signed-off-by: Don Zickus <dzickus@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1336761675-24296-2-git-send-email-dzickus@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-14 11:49:37 +02:00
Linus Torvalds	63f4711aec	Merge git://git.kernel.org/pub/scm/virt/kvm/kvm Pull KVM fixes from Avi Kivity: "Two asynchronous page fault fixes (one guest, one host), a powerpc page refcount fix, and an ia64 build fix." * git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: ia64: fix build due to typo KVM: PPC: Book3S HV: Fix refcounting of hugepages KVM: Do not take reference to mm during async #PF KVM: ensure async PF event wakes up vcpu from halt	2012-05-09 11:14:13 -07:00
Robert Richter	8b1e13638d	perf/x86-ibs: Fix usage of IBS op current count The value of IbsOpCurCnt rolls over when it reaches IbsOpMaxCnt. Thus, it is reset to zero by hardware. To get the correct count we need to add the max count to it in case we received an ibs sample (valid bit set). Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1333390758-10893-13-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-09 15:23:17 +02:00
Robert Richter	fc5fb2b5e1	perf/x86-ibs: Catch spurious interrupts after stopping IBS After disabling IBS there could be still incomming NMIs with samples that even have the valid bit cleared. Mark all this NMIs as handled to avoid spurious interrupt messages. Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1333390758-10893-12-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-09 15:23:16 +02:00
Robert Richter	c9574fe0bd	perf/x86-ibs: Implement workaround for IBS erratum #420 When disabling ibs there might be the case where hardware continuously generates interrupts. This is described in erratum #420 (Instruction- Based Sampling Engine May Generate Interrupt that Cannot Be Cleared). To avoid this we must clear the counter mask first and then clear the enable bit. This patch implements this. See Revision Guide for AMD Family 10h Processors, Publication #41322. Note: We now keep track of the last read ibs config value which is then used to disable ibs. To update the config value we pass now a pointer to the functions reading it. Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1333390758-10893-11-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-09 15:23:16 +02:00
Robert Richter	7caaf4d824	perf/x86-ibs: Extend hw period that triggers overflow If the last hw period is too short we might hit the irq handler which biases the results. Thus try to have a max last period that triggers the sw overflow. Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1333390758-10893-10-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-09 15:23:15 +02:00
Robert Richter	fc006cf7cc	perf/x86-ibs: Trigger overflow if remaining period is too small There are cases where the remaining period is smaller than the minimal possible value. In this case the counter is restarted with the minimal period. This is of no use as the interrupt handler will trigger immediately again and most likely hits itself. This biases the results. So, if the remaining period is within the min range, we better do not restart the counter and instead trigger the overflow. Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1333390758-10893-9-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-09 15:23:15 +02:00
Robert Richter	98112d2e95	perf/x86-ibs: Rename some variables Simple patch that just renames some variables for better understanding. Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1333390758-10893-8-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-09 15:23:14 +02:00
Robert Richter	450bbd493d	perf/x86-ibs: Precise event sampling with IBS for AMD CPUs This patch adds support for precise event sampling with IBS. There are two counting modes to count either cycles or micro-ops. If the corresponding performance counter events (hw events) are setup with the precise flag set, the request is redirected to the ibs pmu: perf record -a -e cpu-cycles:p ... # use ibs op counting cycle count perf record -a -e r076:p ... # same as -e cpu-cycles:p perf record -a -e r0C1:p ... # use ibs op counting micro-ops Each ibs sample contains a linear address that points to the instruction that was causing the sample to trigger. With ibs we have skid 0. Thus, ibs supports precise levels 1 and 2. Samples are marked with the PERF_EFLAGS_EXACT flag set. In rare cases the rip is invalid when IBS was not able to record the rip correctly. Then the PERF_EFLAGS_EXACT flag is cleared and the rip is taken from pt_regs. V2: * don't drop samples in precise level 2 if rip is invalid, instead support the PERF_EFLAGS_EXACT flag Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20120502103309.GP18810@erda.amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-09 15:23:14 +02:00
Robert Richter	d47e8238cd	perf/x86-ibs: Take instruction pointer from ibs sample Each IBS sample contains a linear address of the instruction that caused the sample to trigger. This address is more precise than the rip that was taken from the interrupt handler's stack. Update the rip with that address. We use this in the next patch to implement precise-event sampling on AMD systems using IBS. Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1333390758-10893-6-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-09 15:23:13 +02:00
Robert Richter	6accb9cf76	perf/x86-ibs: Fix frequency profiling Fixing profiling at a fixed frequency, in this case the freq value and sample period was setup incorrectly. Since sampling periods are adjusted we also allow periods that have lower 4 bits set. Another fix is the setup of the hw counter: If we modify hwc->sample_period, we also need to update hwc->last_period and hwc->period_left. Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1333390758-10893-5-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-09 15:23:13 +02:00
Robert Richter	7bf352384f	perf/x86-ibs: Enable ibs op micro-ops counting mode Allow enabling ibs op micro-ops counting mode. Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1333390758-10893-4-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-09 15:23:12 +02:00
Robert Richter	fd0d000b2c	perf: Pass last sampling period to perf_sample_data_init() We always need to pass the last sample period to perf_sample_data_init(), otherwise the event distribution will be wrong. Thus, modifiyng the function interface with the required period as argument. So basically a pattern like this: perf_sample_data_init(&data, ~0ULL); data.period = event->hw.last_period; will now be like that: perf_sample_data_init(&data, ~0ULL, event->hw.last_period); Avoids unininitialized data.period and simplifies code. Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1333390758-10893-3-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-09 15:23:12 +02:00
Robert Richter	c75841a398	perf/x86-ibs: Fix update of period The last sw period was not correctly updated on overflow and thus led to wrong distribution of events. We always need to properly initialize data.period in struct perf_sample_data. Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1333390758-10893-2-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-09 15:23:11 +02:00
Ingo Molnar	ad8537cda6	Merge branch 'perf/x86-ibs' into perf/core	2012-05-09 15:22:23 +02:00
Ingo Molnar	ad7687dde8	x86/numa: Check for nonsensical topologies on real hw as well Instead of only checking nonsensical topologies on numa-emu, do it on real hardware as well, and print a warning. Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Tejun Heo <tj@kernel.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: x86@kernel.org Link: http://lkml.kernel.org/n/tip-re15l0jqjtpz709oxozt2zoh@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-09 13:32:35 +02:00
Peter Zijlstra	0acbb440f0	x86/numa: Hard partition cpu topology masks on node boundaries When using numa=fake= you can get weird topologies where LLCs can span nodes and other such nonsense. Cure this by hard partitioning these masks on node boundaries. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Tejun Heo <tj@kernel.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: x86@kernel.org Link: http://lkml.kernel.org/n/tip-di5vwjm96q5vrb76opwuflwx@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-09 13:28:59 +02:00
Jan Beulich	57da8b960b	x86: Avoid double stack traces with show_regs() What was called show_registers() so far already showed a stack trace for kernel faults, and kernel_stack_pointer() isn't even valid to be used for faults from user mode, hence it was pointless for show_regs() to call show_trace() after show_registers(). Simply rename show_registers() to show_regs() and eliminate the old definition. Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/4FAA3D3902000078000826E1@nat28.tlf.novell.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-09 11:44:42 +02:00
Jarkko Sakkinen	cda846f101	x86, realmode: read cr4 and EFER from kernel for 64-bit trampoline This patch changes 64-bit trampoline so that CR4 and EFER are provided by the kernel instead of using fixed values. Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@intel.com> Link: http://lkml.kernel.org/r/1336501366-28617-24-git-send-email-jarkko.sakkinen@intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2012-05-08 15:04:27 -07:00
Jarkko Sakkinen	bf8b88e977	x86, realmode: fixes compilation issue in tboot.c Fixed include path of wakeup.h in tboot.c. Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@intel.com> Link: http://lkml.kernel.org/r/1336501366-28617-23-git-send-email-jarkko.sakkinen@intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2012-05-08 15:04:27 -07:00
Jarkko Sakkinen	f37240f16b	x86, realmode: header for trampoline code Added header for trampoline code that can be used to supply input data to it. This makes interface between real mode code and kernel cleaner and simpler. Replaced two confusing pointers to level4 pgt in trampoline_64.S with a single pointer to the beginning of the page table. Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@intel.com> Link: http://lkml.kernel.org/r/1336501366-28617-21-git-send-email-jarkko.sakkinen@intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2012-05-08 11:48:45 -07:00
Jarkko Sakkinen	c4845474a0	x86, realmode: flattened rm hierachy Simplified hierarchy under rm directory to a flat directory because it is not anymore really justified to have own directory for wakeup code. It only adds more complexity. Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@intel.com> Link: http://lkml.kernel.org/r/1336501366-28617-20-git-send-email-jarkko.sakkinen@intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2012-05-08 11:48:45 -07:00
Jarkko Sakkinen	b429dbf6e8	x86, realmode: don't copy real_mode_header Replaced copying of real_mode_header with a pointer to beginning of RM memory. Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@intel.com> Link: http://lkml.kernel.org/r/1336501366-28617-19-git-send-email-jarkko.sakkinen@intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2012-05-08 11:48:45 -07:00
Jarkko Sakkinen	8e029fcdd8	x86, realmode: fix 64-bit wakeup sequence There were number of issues in wakeup sequence: - Wakeup stack was placed in hardcoded address. - NX bit in EFER was not enabled. - Initialization incorrectly set physical address of secondary_startup_64. - Some alignment issues. This patch fixes these issues and in addition: - Unifies coding conventions in .S files. - Sets alignments of code and data right. Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@intel.com> Link: http://lkml.kernel.org/r/1336501366-28617-18-git-send-email-jarkko.sakkinen@intel.com Originally-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Len Brown <len.brown@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2012-05-08 11:48:11 -07:00
Jarkko Sakkinen	f156ffc439	x86, realmode: Set permission for real mode pages Set proper permissions for rodata, text and data, removing the realmode trampoline area as a remaining RWX memory mapping in the kernel. Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@intel.com> Link: http://lkml.kernel.org/r/1336501366-28617-8-git-send-email-jarkko.sakkinen@intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2012-05-08 11:47:08 -07:00
Jarkko Sakkinen	c9b77ccb52	x86, realmode: Move ACPI wakeup to unified realmode code Migrated ACPI wakeup code to the real-mode blob. Code existing in .x86_trampoline can be completely removed. Static descriptor table in wakeup_asm.S is courtesy of H. Peter Anvin. Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@intel.com> Link: http://lkml.kernel.org/r/1336501366-28617-7-git-send-email-jarkko.sakkinen@intel.com Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Len Brown <len.brown@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2012-05-08 11:46:05 -07:00
Jarkko Sakkinen	48927bbb97	x86, realmode: Move SMP trampoline to unified realmode code Migrated SMP trampoline code to the real mode blob. SMP trampoline code is not yet removed from .x86_trampoline because it is needed by the wakeup code. [ hpa: always enable compiling startup_32_smp in head_32.S... it is only a few instructions which go into .init on UP builds, and it makes the rest of the code less #ifdef ugly. ] Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@intel.com> Link: http://lkml.kernel.org/r/1336501366-28617-6-git-send-email-jarkko.sakkinen@intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2012-05-08 11:41:51 -07:00
Jarkko Sakkinen	5a8c9aebe0	x86, realmode: Move reboot_32.S to unified realmode code Migrated reboot_32.S from x86_trampoline to the real-mode blob. Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@intel.com> Link: http://lkml.kernel.org/r/1336501366-28617-5-git-send-email-jarkko.sakkinen@intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2012-05-08 11:41:50 -07:00
Jarkko Sakkinen	084ee1c641	x86, realmode: Relocator for realmode code Implements relocator for real mode code that is called as part of setup_arch(). Processes segment relocations and linear relocations. Real-mode code is relocated to a free hole below 1 MB. Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@intel.com> Link: http://lkml.kernel.org/r/1336501366-28617-4-git-send-email-jarkko.sakkinen@intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2012-05-08 11:41:49 -07:00
Linus Torvalds	e9b19cd43f	Merge branch 'for-3.4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu Pull two percpu fixes from Tejun Heo: "One adds missing KERN_CONT on split printk()s and the other makes the percpu allocator avoid using PMD_SIZE as atom_size on x86_32. Using PMD_SIZE led to vmalloc area exhaustion on certain configurations (x86_32 android) and the only cost of using PAGE_SIZE instead is static percpu area not being aligned to large page mapping." * 'for-3.4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: percpu, x86: don't use PMD_SIZE as embedded atom_size on 32bit percpu: use KERN_CONT in pcpu_dump_alloc_info()	2012-05-08 11:06:07 -07:00
Tejun Heo	d5e28005a1	percpu, x86: don't use PMD_SIZE as embedded atom_size on 32bit With the embed percpu first chunk allocator, x86 uses either PAGE_SIZE or PMD_SIZE for atom_size. PMD_SIZE is used when CPU supports PSE so that percpu areas are aligned to PMD mappings and possibly allow using PMD mappings in vmalloc areas in the future. Using larger atom_size doesn't waste actual memory; however, it does require larger vmalloc space allocation later on for !first chunks. With reasonably sized vmalloc area, PMD_SIZE shouldn't be a problem but x86_32 at this point is anything but reasonable in terms of address space and using larger atom_size reportedly leads to frequent percpu allocation failures on certain setups. As there is no reason to not use PMD_SIZE on x86_64 as vmalloc space is aplenty and most x86_64 configurations support PSE, fix the issue by always using PMD_SIZE on x86_64 and PAGE_SIZE on x86_32. v2: drop cpu_has_pse test and make x86_64 always use PMD_SIZE and x86_32 PAGE_SIZE as suggested by hpa. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Yanmin Zhang <yanmin.zhang@intel.com> Reported-by: ShuoX Liu <shuox.liu@intel.com> Acked-by: H. Peter Anvin <hpa@zytor.com> LKML-Reference: <4F97BA98.6010001@intel.com> Cc: stable@vger.kernel.org	2012-05-08 09:42:18 -07:00
Jan Beulich	b2a3477727	x86: Fix section annotation of acpi_map_cpu2node() Commit `943bc7e110` ("x86: Fix section warnings") added __cpuinitdata here, while for functions __cpuinit should be used. Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: <sp@numascale.com> Link: http://lkml.kernel.org/r/4FA947910200007800082470@nat28.tlf.novell.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Steffen Persvold <sp@numascale.com>	2012-05-08 16:40:26 +02:00
Thomas Gleixner	38e7c572ce	x86: Use common threadinfo allocator The only difference is the free_thread_info function, which frees xstate. Use the new arch_release_task_struct() function instead and switch over to the core allocator. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20120505150141.559556763@linutronix.de Cc: x86@kernel.org	2012-05-08 14:08:44 +02:00
Thomas Gleixner	67ba5293f7	Merge branch 'smp/threadalloc' into smp/hotplug Reason: Pull in the separate branch which was created so arch/tile can base further work on it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2012-05-08 14:07:48 +02:00
Thomas Gleixner	85f7f65627	x86: Use kick_all_cpus_sync() Use kick_all_cpus_sync() and remove cpu_idle_wait(). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20120507175652.190382227@linutronix.de Cc: x86@kernel.org	2012-05-08 12:35:06 +02:00
Márton Németh	d1ecad6eee	x86/apic: Only compile local function if used with !CONFIG_GENERIC_PENDING_IRQ The local function io_apic_level_ack_pending() is only called from io_apic_level_ack_pending(). The later function is only compiled if CONFIG_GENERIC_PENDING_IRQ is defined. Move the io_apic_level_ack_pending() to the existing #ifdef CONFIG_GENERIC_PENDING_IRQ code block. This will remove the following warning message during compiling without CONFIG_GENERIC_PENDING_IRQ defined: * arch/x86/kernel/apic/io_apic.c:382: warning: ‘io_apic_level_ack_pending’ defined but not used Signed-off-by: Márton Németh <nm127@freemail.hu> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Naga Chumbalkar <nagananda.chumbalkar@hp.com> Link: http://lkml.kernel.org/r/1336461860.2296.3.camel@sbsiddha-mobl2 Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-08 11:23:15 +02:00
Shuah Khan	e826abd523	x86, microcode: microcode_core.c simple_strtoul cleanup Change reload_for_cpu() in kernel/microcode_core.c to call kstrtoul() instead of calling obsoleted simple_strtoul(). Signed-off-by: Shuah Khan <shuahkhan@gmail.com> Reviewed-by: Borislav Petkov <bp@alien8.de> Link: http://lkml.kernel.org/r/1336324264.2897.9.camel@lorien2 Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2012-05-07 11:36:49 -07:00
Ingo Molnar	9438ef7f4e	x86/apic: Fix UP boot crash Commit `31b3c9d723` ("xen/x86: Implement x86_apic_ops") implemented this: ... without considering that on UP the function pointer might be NULL. Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Link: http://lkml.kernel.org/n/tip-3pfty0ml4yp62phbkchichh0@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-07 19:19:56 +02:00
Srivatsa S. Bhat	19209bbb86	x86/sched: Make mwait_usable() heed to "idle=" kernel parameters properly The checks that exist in mwait_usable() for "idle=" kernel parameters are insufficient. As a result, mwait_usable() can return 1 even if "idle=nomwait" or "idle=poll" or "idle=halt" parameters are passed. Of these cases, incorrect handling of idle=nomwait is a universal problem since mwait can get used for usual CPU idling. However the rest of the cases are problematic only during CPU Hotplug (offline) because, in the CPU offline path, the function mwait_play_dead() is called, which might result in mwait being used in the offline CPUs, if mwait_usable() happens to return 1. Fix these issues by checking for the boot time "idle=" kernel parameter properly in mwait_usable(). The first issue (usual cpu idling) is demonstrated below: Before applying the patch (dmesg snippet): [ 0.000000] Command line: [...] idle=nomwait [ 0.000000] Kernel command line: [...] idle=nomwait [ 0.000000] RCU dyntick-idle grace-period acceleration is enabled. [ 0.140606] using mwait in idle threads. <======= mwait being used [ 4.303986] cpuidle: using governor ladder [ 4.308232] cpuidle: using governor menu After applying the patch: [ 0.000000] Command line: [...] idle=nomwait [ 0.000000] Kernel command line: [...] idle=nomwait [ 0.000000] RCU dyntick-idle grace-period acceleration is enabled. [ 4.264100] cpuidle: using governor ladder [ 4.268342] cpuidle: using governor menu Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Acked-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: venki@google.com Cc: suresh.b.siddha@intel.com Cc: Borislav Petkov <bp@amd64.org> Cc: lenb@kernel.org Cc: Rafael J. Wysocki <rjw@sisk.pl> Link: http://lkml.kernel.org/r/4F9E37B8.30001@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-07 16:27:20 +02:00
Shai Fultheim	42fa425043	x86: Conditionally update time when ack-ing pending irqs On virtual environments, apic_read could take a long time. As a result, under certain conditions the ack pending loop may exit without any queued irqs left, but after more than one second. A warning will be printed needlessly in this case. If the loop is about to exit regardless of max_loops, don't update it. Signed-off-by: Shai Fultheim <shai@scalemp.com> [ rebased and reworded the commit message] Signed-off-by: Ido Yariv <ido@wizery.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1334873552-31346-1-git-send-email-ido@wizery.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-07 16:25:28 +02:00
Ingo Molnar	79fec2c557	Interrupt remapping ops for x86 This patchset introduces a generic ops-interface for accessing interrupt remapping hardware on x86. It factors out the VT-d specific code from io_apic.c and moves it to drivers/iommu. These changes will be used to add support for AMD interrupt remapping hardware. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJPp8ZZAAoJECvwRC2XARrj0o0P/22YaTNWvl1ZGDnO2Z913pEM hbE0RPsht/2paCR+oXggu3vv3D0BzY5SkC0Hmjr4po4hiwDNXHj+V1QQKHFEMwuO gzWatXCW/dDvIbDNHuCgG76n0kNmDMeps/MgKGrG4seV7zQ2jMfjaWFvJ2UKF/fJ D1Jm+7QLr+MLM/cT64WG5S/8/Pi1N6CYbdXwkPTlFTnKRS+NODWGMo9fgc7VIsI/ 9A1haZp+aV/TvGqklDlSh6CgIUBZVt/cffpftJiuH4L3o5mq1sXaFcK4Whu8PxGD OwmbLe4PKXl39piPe4x8NGGrj0xnk+WN17JUQpefxFIbBkCjfGakNqLxFMKbxH8k AtWeERge3Jd4jcfbmlXQ0GKaL2+BQUzIq0VHadHwsifCKs0KP7qG+q4Ev+02+Xbo gfC5UtemoffI4fXznvNIE0Hcp8EYONmtksPJ2iKOxy+t7uDlESDMDspXf1IOLZzO BODOdDlnsPCCWxT9RsJziVkD6C/hSZRwQIavZchBeMele4+YMpym+POlTJ5lx9KJ TVXYZmRv8sUgzP67s0WuDNlB2SEQq8fHNYIHrCPGqcmp/VHCFzj1nSe5jRu3FdnB Yu44BOoTibYss/aVy21hbB99SRMkpUWfImhH2eZLO/dahtlUMiI9ZnnYBJ62Ol7u bBAnLHtq8WkVZH6drSCe =H+Nt -----END PGP SIGNATURE----- Merge tag 'intr-remapping-ops-for-ingo' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu into core/iommu - This patchset introduces a generic ops-interface for accessing interrupt remapping hardware on x86. It factors out the VT-d specific code from io_apic.c and moves it to drivers/iommu. These changes will be used to add support for AMD interrupt remapping hardware. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-07 16:21:35 +02:00
Shai Fultheim	ddc5681ed3	x86/cache_info: Fix setup of l2/l3 ids On some architectures (such as vSMP), it is possible to have CPUs with a different number of cores sharing the same cache. The current implementation implicitly assumes that all CPUs will have the same number of cores sharing caches, and as a result, different CPUs can end up with the same l2/l3 ids. Fix this by masking out the shared cache bits, instead of shifting the APICID. By doing so, it is guaranteed that the generated cache ids are always unique. Signed-off-by: Shai Fultheim <shai@scalemp.com> [ rebased, simplified, and reworded the commit message] Signed-off-by: Ido Yariv <ido@wizery.com> Cc: Borislav Petkov <borislav.petkov@amd.com> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Mike Travis <travis@sgi.com> Cc: Dave Jones <davej@redhat.com> Link: http://lkml.kernel.org/r/1334873351-31142-1-git-send-email-ido@wizery.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-07 15:27:37 +02:00
Jan Beulich	b2d6aba965	x86: Allow multiple values to be specified with "hpet=" This is particularly to be able to specify "hpet=force,verbose", as "force" ought to be a primary candidate for also wanting to use "verbose". Signed-off-by: Jan Beulich <jbeulich@suse.com> Link: http://lkml.kernel.org/r/4F79D120020000780007C031@nat28.tlf.novell.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-07 15:06:28 +02:00
Jan Beulich	396e2c6fed	x86: Clear HPET configuration registers on startup While Linux itself has been calling hpet_disable() for quite a while, having e.g. a secondary (kexec) kernel depend on such behavior of the primary (crashed) environment is fragile. It particularly broke until very recently when the primary environment was Xen based, as that hypervisor did not clear any of the HPET settings it may have used. Rather than blindly (and incompletely) clearing certain HPET settings in hpet_disable(), latch the config register settings during boot and restore then here. (Note on the hpet_set_mode() change: Now that we're clearing the level bit upon initialization, there's no need anymore to do so here.) Signed-off-by: Jan Beulich <jbeulich@suse.com> Link: http://lkml.kernel.org/r/4F79D0BB020000780007C02D@nat28.tlf.novell.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-07 15:06:27 +02:00
Srivatsa S. Bhat	7164b3f5e5	x86/microcode: Ensure that module is only loaded on supported Intel CPUs Exit early when there's no support for a particular CPU family. Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Dave Jones <davej@redhat.com> Cc: tigran@aivazian.fsnet.co.uk Cc: Borislav Petkov <borislav.petkov@amd.com> Link: http://lkml.kernel.org/r/4F8BDB58.6070007@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-07 14:37:14 +02:00
Suresh Siddha	8a8f422d3b	iommu: rename intr_remapping.[ch] to irq_remapping.[ch] Make the file names consistent with the naming conventions of irq subsystem. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Joerg Roedel <joerg.roedel@amd.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2012-05-07 14:35:00 +02:00
Suresh Siddha	95a02e976c	iommu: rename intr_remapping references to irq_remapping Make the code consistent with the naming conventions of irq subsystem. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Joerg Roedel <joerg.roedel@amd.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2012-05-07 14:35:00 +02:00
Joerg Roedel	263b5e8629	x86, iommu/vt-d: Clean up interfaces for interrupt remapping Remove the Intel specific interfaces from dmar.h and remove asm/irq_remapping.h which is only used for io_apic.c anyway. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Acked-by: Yinghai Lu <yinghai@kernel.org> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2012-05-07 14:35:00 +02:00
Joerg Roedel	5e2b930b07	iommu/vt-d: Convert MSI remapping setup to remap_ops This patch introduces remapping-ops for setting ups MSI interrupts. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Acked-by: Yinghai Lu <yinghai@kernel.org> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2012-05-07 14:35:00 +02:00
Joerg Roedel	9d619f6572	iommu/vt-d: Convert free_irte into a remap_ops callback The operation for releasing a remapping entry is iommu specific too. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Acked-by: Yinghai Lu <yinghai@kernel.org> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2012-05-07 14:34:59 +02:00
Joerg Roedel	4c1bad6a0a	iommu/vt-d: Convert IR set_affinity function to remap_ops The function to set interrupt affinity with interrupt remapping enabled is Intel specific too. So move it to the irq_remap_ops too. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Acked-by: Yinghai Lu <yinghai@kernel.org> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2012-05-07 14:34:59 +02:00
Joerg Roedel	0c3f173a88	iommu/vt-d: Convert IR ioapic-setup to use remap_ops The IOAPIC setup routine for interrupt remapping is VT-d specific. Move it to the irq_remap_ops and add a call helper function. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Acked-by: Yinghai Lu <yinghai@kernel.org> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2012-05-07 14:34:59 +02:00
Joerg Roedel	4f3d8b67ad	iommu/vt-d: Convert missing apic.c intr-remapping call to remap_ops Convert these calls too: * Disable of remapping hardware * Reenable of remapping hardware * Enable fault handling With that all of arch/x86/kernel/apic/apic.c is converted to use the generic intr-remapping interface. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Acked-by: Yinghai Lu <yinghai@kernel.org> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2012-05-07 14:34:59 +02:00
Joerg Roedel	736baef447	iommu/vt-d: Make intr-remapping initialization generic This patch introduces irq_remap_ops to hold implementation specific function pointer to handle interrupt remapping. As the first part the initialization functions for VT-d are converted to these ops. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Acked-by: Yinghai Lu <yinghai@kernel.org> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2012-05-07 14:34:59 +02:00
Ingo Molnar	22042c086c	Merge branch 'stable/for-ingo-v3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen into x86/apic	2012-05-07 12:07:51 +02:00
Ingo Molnar	19631cb3d6	Merge branch 'tip/perf/core-4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace into perf/core	2012-05-07 11:03:52 +02:00
Larry Finger	febb72a6e4	IA32 emulation: Fix build problem for modular ia32 a.out support Commit `ce7e5d2d19` ("x86: fix broken TASK_SIZE for ia32_aout") breaks kernel builds when "CONFIG_IA32_AOUT=m" with ERROR: "set_personality_ia32" [arch/x86/ia32/ia32_aout.ko] undefined! make[1]: *** [__modpost] Error 1 The entry point needs to be exported. Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Acked-by: Al Viro <viro@zeniv.linux.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-06 18:26:20 -07:00
Gleb Natapov	62c49cc976	KVM: Do not take reference to mm during async #PF It turned to be totally unneeded. The reason the code was introduced is so that KVM can prefault swapped in page, but prefault can fail even if mm is pinned since page table can change anyway. KVM handles this situation correctly though and does not inject spurious page faults. Fixes: "INFO: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected" warning while running LTP inside a KVM guest using the recent -next kernel. Reported-by: Sasha Levin <levinsasha928@gmail.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-05-06 15:00:02 +03:00
Thomas Gleixner	45046892ef	x86: Use generic init_task Same code. Use the generic version. The special Makefile treatment is pointless anyway as init_task.o contains only data which is handled by the linker script. So no point on being treated like head text. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20120503085035.739963562@linutronix.de Cc: x86@kernel.org	2012-05-05 13:00:26 +02:00
Steven Rostedt	59a094c994	ftrace/x86: Use asm/kprobes.h instead of linux/kprobes.h If CONFIG_KPROBES is not set, then linux/kprobes.h will not include asm/kprobes.h needed by x86/ftrace.c for the BREAKPOINT macro. The x86/ftrace.c file should just include asm/kprobes.h as it does not need the rest of kprobes. Reported-by: Ingo Molnar <mingo@elte.hu> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-05-04 09:28:29 -04:00
James Morris	898bfc1d46	Linux 3.4-rc5 -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.18 (GNU/Linux) iQEcBAABAgAGBQJPnb50AAoJEHm+PkMAQRiGAE0H/A4zFZIUGmF3miKPDYmejmrZ oVDYxVAu6JHjHWhu8E3VsinvyVscowjV8dr15eSaQzmDmRkSHAnUQ+dB7Di7jLC2 MNopxsWjwyZ8zvvr3rFR76kjbWKk/1GYytnf7GPZLbJQzd51om2V/TY/6qkwiDSX U8Tt7ihSgHAezefqEmWp2X/1pxDCEt+VFyn9vWpkhgdfM1iuzF39MbxSZAgqDQ/9 JJrBHFXhArqJguhENwL7OdDzkYqkdzlGtS0xgeY7qio2CzSXxZXK4svT6FFGA8Za xlAaIvzslDniv3vR2ZKd6wzUwFHuynX222hNim3QMaYdXm012M+Nn1ufKYGFxI0= =4d4w -----END PGP SIGNATURE----- Merge tag 'v3.4-rc5' into next Linux 3.4-rc5 Merge to pull in prerequisite change for Smack: `86812bb0de` Requested by Casey.	2012-05-04 12:46:40 +10:00
Konrad Rzeszutek Wilk	4a8e2a3115	x86/apic: Replace io_apic_ops with x86_io_apic_ops. Which makes the code fit within the rest of the x86_ops functions. Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> [v1: Changed x86_apic -> x86_ioapic per Yinghai Lu <yinghai@kernel.org> suggestion] [v2: Rebased on tip/x86/urgent and redid to match Ingo's syntax style] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2012-05-01 14:50:09 -04:00
Borislav Petkov	575203b474	x86, MCE, AMD: Disable error thresholding bank 4 on some models Turn off MC4_MISC thresholding banks on models which have them but that particular processor implementation does not supply applicable error sources to be counted. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2012-04-30 13:22:54 +02:00
Borislav Petkov	d26ecc4894	x86, MCE, AMD: Hide interrupt_enable sysfs node Depending on whether the box supports the APIC LVT interrupt for thresholding, we want to show the 'interrupt_enable' sysfs node or not. Make that the case by adding it to the default sysfs attributes only if it is supported. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2012-04-30 13:22:44 +02:00
Borislav Petkov	f227d4306c	x86, MCE, AMD: Make APIC LVT thresholding interrupt optional Currently, the APIC LVT interrupt for error thresholding is implicitly enabled. However, there are models in the F15h range which do not enable it. Make the code machinery which sets up the APIC interrupt support an optional setting and add an ->interrupt_capable member to the bank representation mirroring that capability and enable the interrupt offset programming only if it is true. Simplify code and fixup comment style while at it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2012-04-30 13:22:22 +02:00
Steven Rostedt	4a6d70c950	ftrace/x86: Remove the complex ftrace NMI handling code As ftrace function tracing would require modifying code that could be executed in NMI context, which is not stopped with stop_machine(), ftrace had to do a complex algorithm with various stages of setup and memory barriers to make it work. With the new breakpoint method, this is no longer required. The changes to the code can be done without any problem in NMI context, as well as without stop machine altogether. Remove the complex code as it is no longer needed. Also, a lot of the notrace annotations could be removed from the NMI code as it is now safe to trace them. With the exception of do_nmi itself, which does some special work to handle running in the debug stack. The breakpoint method can cause NMIs to double nest the debug stack if it's not setup properly, and that is done in do_nmi(), thus that function must not be traced. (Note the arch sh may want to do the same) Cc: Paul Mundt <lethal@linux-sh.org> Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-04-27 21:11:28 -04:00
Steven Rostedt	08d636b6d4	ftrace/x86: Have arch x86_64 use breakpoints instead of stop machine This method changes x86 to add a breakpoint to the mcount locations instead of calling stop machine. Now that iret can be handled by NMIs, we perform the following to update code: 1) Add a breakpoint to all locations that will be modified 2) Sync all cores 3) Update all locations to be either a nop or call (except breakpoint op) 4) Sync all cores 5) Remove the breakpoint with the new code. 6) Sync all cores [ Added updates that Masami suggested: Use unlikely(modifying_ftrace_code) in int3 trap to keep kprobes efficient. Don't use NOTIFY_* in ftrace handler in int3 as it is not a notifier. ] Cc: H. Peter Anvin <hpa@zytor.com> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-04-27 21:10:44 -04:00
Andreas Herrmann	f7f286a910	x86/amd: Re-enable CPU topology extensions in case BIOS has disabled it BIOS will switch off the corresponding feature flag on family 15h models 10h-1fh non-desktop CPUs. The topology extension CPUID leafs are required to detect which cores belong to the same compute unit. (thread siblings mask is set accordingly and also correct information about L1i and L2 cache sharing depends on this). W/o this patch we wouldn't see which cores belong to the same compute unit and also cache sharing information for L1i and L2 would be incorrect on such systems. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-04-27 16:43:09 +02:00
Robert Richter	5f09fc6889	perf/x86: Fix cmpxchg() usage in amd_put_event_constraints() Now the return value of cmpxchg() is used to match an event. The change removes the duplicate event comparison and traverses the list until an event was removed. This also fixes the following warning: arch/x86/kernel/cpu/perf_event_amd.c:170: warning: value computed is not used Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1333643084-26776-3-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-04-26 13:52:51 +02:00
Robert Richter	10c250234c	perf: Trivial cleanup of duplicate code Removing duplicate code. Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1333643084-26776-2-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-04-26 13:52:50 +02:00
Thomas Gleixner	7eb43a6d23	x86: Use generic idle thread allocation Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/20120420124557.246929343@linutronix.de	2012-04-26 12:06:10 +02:00
Thomas Gleixner	5cdaf1834f	x86: Add task_struct argument to smp_ops.cpu_up Preparatory patch to use the generic idle thread allocation. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/20120420124557.176604405@linutronix.de	2012-04-26 12:06:10 +02:00
Ingo Molnar	fab06992de	perf/x86: Clean up register_nmi_handler() usage A function name represents the pointer to it - no need to take the address of it. (Fixing this helps us introduce some macro magic around register_nmi_handler() in the future.) Cc: Robert Richter <robert.richter@amd.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-04-25 12:56:26 +02:00
Greg Pearson	ea0dcf903e	x86/apic: Use x2apic physical mode based on FADT setting Provide systems that do not support x2apic cluster mode a mechanism to select x2apic physical mode using the FADT FORCE_APIC_PHYSICAL_DESTINATION_MODE bit. Changes from v1: (based on Suresh's comments) - removed #ifdef CONFIG_ACPI - removed #include <linux/acpi.h> Signed-off-by: Greg Pearson <greg.pearson@hp.com> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/1335313436-32020-1-git-send-email-greg.pearson@hp.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-04-25 12:47:08 +02:00
Li Zhong	72b3fb2471	x86/nmi: Fix page faults by nmiaction if kmemcheck is enabled This patch tries to fix the problem of page fault exception caused by accessing nmiaction structure in nmi if kmemcheck is enabled. If kmemcheck is enabled, the memory allocated through slab are in pages that are marked non-present, so that some checks could be done in the page fault handling code ( e.g. whether the memory is read before written to ). As nmiaction is allocated in this way, so it resides in a non-present page. Then there is a page fault while the nmi code accessing the nmiaction structure, which would then cause a warning by WARN_ON_ONCE(in_nmi()) in kmemcheck_fault(), called by do_page_fault(). This significantly simplifies the code as well, as the whole dynamic allocation dance goes away. v2: as Peter suggested, changed the nmiaction to use static storage. v3: as Peter suggested, use macro to shorten the codes. Also keep the original usage of register_nmi_handler, so users of this call doesn't need change. Tested-by: Seiji Aguchi <seiji.aguchi@hds.com> Fixes: https://lkml.org/lkml/2012/3/2/356 Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> [ simplified the wrappers ] Signed-off-by: Don Zickus <dzickus@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: thomas.mingarelli@hp.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/1333051877-15755-4-git-send-email-dzickus@redhat.com [ tidied the patch a bit ] Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-04-25 12:44:06 +02:00
Don Zickus	553222f3e8	x86/nmi: Add new NMI queues to deal with IO_CHK and SERR In discussions with Thomas Mingarelli about hpwdt, he explained to me some issues they were some when using their virtual NMI button to test the hpwdt driver. It turns out the virtual NMI button used on HP's machines do no send unknown NMIs but instead send IO_CHK NMIs. The way the kernel code is written, the hpwdt driver can not register itself against that type of NMI and therefore can not successfully capture system information before panic'ing. To solve this I created two new NMI queues to allow driver to register against the IO_CHK and SERR NMIs. Or in the hpwdt all three (if you include unknown NMIs too). The change is straightforward and just mimics what the unknown NMI does. Reported-and-tested-by: Thomas Mingarelli <thomas.mingarelli@hp.com> Signed-off-by: Don Zickus <dzickus@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/1333051877-15755-3-git-send-email-dzickus@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-04-25 12:43:34 +02:00
Ingo Molnar	cd32b1616b	A small L3 cache index disable fix from Srivatsa Bhat which unifies the way the code checks for already disabled indices. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIcBAABAgAGBQJPkEFNAAoJEBLB8Bhh3lVKEigP/1hrNM0oE3IQHejNYkZ2kbj0 DL4WkALiHCgWJVcr/lk9PvTaY7aIAo8HFgvbUKF4PcBvnFZvu6rckJPIsWgsE1wD +OHDaO27vyYKMaA9ImvqYneB8hcl528EDlZ7ssfTepQXPyx99SoTYc9ToT1+znVp qQUC+d67MTLl/eHL+i6rbQfYdVaXGaVTuAIpzQn3oTqUDLrmXKe+60oTgnC0zgSW kQ63Vo8MHi9CpzCe0JhZwvK3d8sPzrBktNisaEbdHe+0Gk5zvcEiPzE2nIyvLXjz cf0QoQFQHzniCdFQoYjzLafT3aItBsN1v29gOrydPT6LxMZ8wO5k+8Suu1NcyI9R RLJR61wKwSzi4JQ/1+LAqoDQHrldATKPCM74BLYiNTi8OGeqda+10COJQmLFITzM 9mn9fg/ZPg8V+z+0e2zObYcFVRtUCIs+XRWIzjhuPICdRRnwoNT+b9HyrLZ5JY1X BH3nHon0W4Bm5jEBhOb02XLdzBMtF3vRM4mNYCzpoS/tDFRszyOuV63FXkPurVbe hT9ZKhDZ63YV8ycwx2jqZgZAFuySi719kG3aUMDNMJYbUEi6D0QRKH9ngE6JAvwH rw1hX65sNX9jbBOAvcDTq2780SpV3dpO1TywLip9uSWIk6DYyEVHbr8AdWwnLlfb fdq9y8V/3+NC9aTNf/e3 =VwUL -----END PGP SIGNATURE----- Merge tag 'l3-fix-for-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp into x86/urgent A small L3 cache index disable fix from Srivatsa Bhat which unifies the way the code checks for already disabled indices. ( Pulling it into v3.4 despite the v3.5 tag - the fix is small and we better keep the same code across kernel versions for such user facing interfaces. ) Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-04-25 12:24:16 +02:00
Konrad Rzeszutek Wilk	cd74257b97	x86, acpi: Call acpi_enter_sleep_state via an asmlinkage C function from assembler With commit `a2ef5c4fd4` "ACPI: Move module parameter gts and bfs to sleep.c" the wake_sleep_flags is required when calling acpi_enter_sleep_state. The assembler code in wakeup_*.S did not do that. One solution is to call it from assembler and stick the wake_sleep_flags on the stack (for 32-bit) or in %esi (for 64-bit). hpa and rafael both suggested however to create a wrapper function to call acpi_enter_sleep_state and call said wrapper function ("acpi_enter_s3") from assembler. For 32-bit, the acpi_enter_s3 ends up looking as so: push %ebp mov %esp,%ebp sub $0x8,%esp movzbl 0xc1809314,%eax [wake_sleep_flags] movl $0x3,(%esp) mov %eax,0x4(%esp) call 0xc12d1fa0 <acpi_enter_sleep_state> leave ret And 64-bit: movzbl 0x9afde1(%rip),%esi [wake_sleep_flags] push %rbp mov $0x3,%edi mov %rsp,%rbp callq 0xffffffff812e9800 <acpi_enter_sleep_state> leaveq retq Reviewed-by: H. Peter Anvin <hpa@zytor.com> Suggested-by: H. Peter Anvin <hpa@zytor.com> [v2: Remove extra assembler operations, per hpa review] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Link: http://lkml.kernel.org/r/1335150198-21899-3-git-send-email-konrad.wilk@oracle.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2012-04-23 13:29:18 -07:00
Konrad Rzeszutek Wilk	2a14e541ed	ACPI: Convert wake_sleep_flags to a value instead of function With commit `a2ef5c4fd4` "ACPI: Move module parameter gts and bfs to sleep.c" the wake_sleep_flags is required when calling acpi_enter_sleep_state, which means that if there are functions outside the sleep.c code they can't get the wake_sleep_flags values. This converts the function in to a exported value and converts the module config operands to a function. Acked-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Lin Ming <ming.m.lin@intel.com> [v2: Parameters can be turned on/off dynamically] [v3: unsigned char -> u8] [v4: val -> kp->arg] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Link: http://lkml.kernel.org/r/1335150198-21899-2-git-send-email-konrad.wilk@oracle.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2012-04-23 13:29:07 -07:00
Chen Gong	8571723a69	x86/mce Add validation check before GHES error is recorded When GHES error record is logged into mcelog kernel buffer, a validation check for physical address is necessary, which prevents reporting an invalid physical address. [Since physical address is the only useful element in this error record, we drop generating the record completely if we don't have a valid address] Signed-off-by: Chen Gong <gong.chen@linux.intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2012-04-20 16:02:05 -07:00
H. Peter Anvin	5d6f8d77ed	x86, extable: Remove open-coded exception table entries in arch/x86/kernel/test_rodata.c Remove open-coded exception table entries in arch/x86/kernel/test_rodata.c, and replace them with _ASM_EXTABLE() macros; this will allow us to change the format and type of the exception table entries. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: David Daney <david.daney@cavium.com> Link: http://lkml.kernel.org/r/CA%2B55aFyijf43qSu3N9nWHEBwaGbb7T2Oq9A=9EyR=Jtyqfq_cQ@mail.gmail.com	2012-04-20 13:51:38 -07:00
H. Peter Anvin	d7abc0fa99	x86, extable: Remove open-coded exception table entries in arch/x86/kernel/entry_64.S Remove open-coded exception table entries in arch/x86/kernel/entry_64.S, and replace them with _ASM_EXTABLE() macros; this will allow us to change the format and type of the exception table entries. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: David Daney <david.daney@cavium.com> Link: http://lkml.kernel.org/r/CA%2B55aFyijf43qSu3N9nWHEBwaGbb7T2Oq9A=9EyR=Jtyqfq_cQ@mail.gmail.com	2012-04-20 13:51:38 -07:00
H. Peter Anvin	6837a54dd6	x86, extable: Remove open-coded exception table entries in arch/x86/kernel/entry_32.S Remove open-coded exception table entries in arch/x86/kernel/entry_32.S, and replace them with _ASM_EXTABLE() macros; this will allow us to change the format and type of the exception table entries. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: David Daney <david.daney@cavium.com> Link: http://lkml.kernel.org/r/CA%2B55aFyijf43qSu3N9nWHEBwaGbb7T2Oq9A=9EyR=Jtyqfq_cQ@mail.gmail.com	2012-04-20 13:51:38 -07:00
H. Peter Anvin	4c5023a3fa	x86-32: Handle exception table entries during early boot If we get an exception during early boot, walk the exception table to see if we should intercept it. The main use case for this is to allow rdmsr_safe()/wrmsr_safe() during CPU initialization. Since the exception table is currently sorted at runtime, and fairly late in startup, this code walks the exception table linearly. We obviously don't need to worry about modules, however: none have been loaded at this point. This patch changes the early IDT setup to look a lot more like x86-64: we now install handlers for all 32 exception vectors. The output of the early exception handler has changed somewhat as it directly reflects the stack frame of the exception handler, and the stack frame has been somewhat restructured. Finally, centralize the code that can and should be run only once. [ v2: Use early_fixup_exception() instead of linear search ] Signed-off-by: H. Peter Anvin <hpa@zytor.com> Link: http://lkml.kernel.org/r/1334794610-5546-6-git-send-email-hpa@zytor.com	2012-04-19 16:45:02 -07:00
H. Peter Anvin	9900aa2f95	x86-64: Handle exception table entries during early boot If we get an exception during early boot, walk the exception table to see if we should intercept it. The main use case for this is to allow rdmsr_safe()/wrmsr_safe() during CPU initialization. Since the exception table is currently sorted at runtime, and fairly late in startup, this code walks the exception table linearly. We obviously don't need to worry about modules, however: none have been loaded at this point. [ v2: Use early_fixup_exception() instead of linear search ] Link: http://lkml.kernel.org/r/1334794610-5546-5-git-send-email-hpa@zytor.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2012-04-19 15:42:45 -07:00
H. Peter Anvin	ffc4bc9c6f	x86, paravirt: Replace GET_CR2_INTO_RCX with GET_CR2_INTO_RAX GET_CR2_INTO_RCX is asinine: it is only used in one place, the actual paravirt call returns the value in %rax, not %rcx; and the one place that wants it wants the result in %r9. We actually generate as a result of this call: call ... movq %rax, %rcx xorq %rax, %rax /* this value isn't even used... */ movq %rcx, %r9 At least make the macro do what the paravirt call does, which is put the value into %rax. Nevermind the fact that the macro clobbers all the volatile registers. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Link: http://lkml.kernel.org/r/1334794610-5546-4-git-send-email-hpa@zytor.com Cc: Glauber de Oliveira Costa <glommer@parallels.com>	2012-04-19 15:07:56 -07:00
H. Peter Anvin	84f4fc524e	x86: Add symbolic constant for exceptions with error code Add a symbolic constant for the bitmask which states which exceptions carry an error code. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Link: http://lkml.kernel.org/r/1334794610-5546-3-git-send-email-hpa@zytor.com	2012-04-19 15:07:49 -07:00
Marcelo Tosatti	eac0556750	Merge branch 'linus' into queue Merge reason: development work has dependency on kvm patches merged upstream. Conflicts: Documentation/feature-removal-schedule.txt Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2012-04-19 17:06:26 -03:00
Srivatsa S. Bhat	a720b2dd24	x86, intel_cacheinfo: Fix error return code in amd_set_l3_disable_slot() If the L3 disable slot is already in use, return -EEXIST instead of -EINVAL. The caller, store_cache_disable(), checks this return value to print an appropriate warning. Also, we want to signal with -EEXIST that the current index we're disabling has actually been already disabled on the node: $ echo 12 > /sys/devices/system/cpu/cpu3/cache/index3/cache_disable_0 $ echo 12 > /sys/devices/system/cpu/cpu3/cache/index3/cache_disable_0 -bash: echo: write error: File exists $ echo 12 > /sys/devices/system/cpu/cpu3/cache/index3/cache_disable_1 -bash: echo: write error: File exists $ echo 12 > /sys/devices/system/cpu/cpu5/cache/index3/cache_disable_1 -bash: echo: write error: File exists The old code would say -bash: echo: write error: Invalid argument for disable slot 1 when playing the example above with no output in dmesg, which is clearly misleading. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20120419070053.GB16645@elgon.mountain [Boris: add testing for the other index too] Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2012-04-19 18:30:28 +02:00
Tony Luck	95022b8cf6	x86/mce: Avoid reading every machine check bank register twice. Reading machine check bank registers is slow. There is a trend of increasing the number of banks, and the number of cores. The main section of do_machine_check() is a serialized section where each cpu in turn checks every bank. Even on a little two socket SandyBridge-EP system that multiplies out as: 2 sockets * 8 cores * 2 hyperthreads * 20 banks = 640 MSRs We already scan the banks in parallel in mce_no_way_out() to see if there is a fatal error anywhere in the system. If we build a cache of VALID bits during this scan, we can avoid uselessly re-reading banks that have no data. Note that this cache is only a hint. If the valid bit is set in a shared bank, all cpus that share that bank will see it during the parallel scan, but the first to find it in the sequential scan will (usually) clear the bank. Acked-by: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2012-04-19 09:12:43 -07:00
Bryan O'Donoghue	cbf2829b61	x86, apic: APIC code touches invalid MSR on P5 class machines Current APIC code assumes MSR_IA32_APICBASE is present for all systems. Pentium Classic P5 and friends didn't have this MSR. MSR_IA32_APICBASE was introduced as an architectural MSR by Intel @ P6. Code paths that can touch this MSR invalidly are when vendor == Intel && cpu-family == 5 and APIC bit is set in CPUID - or when you simply pass lapic on the kernel command line, on a P5. The below patch stops Linux incorrectly interfering with the MSR_IA32_APICBASE for P5 class machines. Other code paths exist that touch the MSR - however those paths are not currently reachable for a conformant P5. Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linux.intel.com> Link: http://lkml.kernel.org/r/4F8EEDD3.1080404@linux.intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: <stable@vger.kernel.org>	2012-04-18 09:44:31 -07:00
Oleg Nesterov	089f9fba56	i387: ptrace breaks the lazy-fpu-restore logic Starting from `7e16838d` "i387: support lazy restore of FPU state" we assume that fpu_owner_task doesn't need restore_fpu_checking() on the context switch, its FPU state should match what we already have in the FPU on this CPU. However, debugger can change the tracee's FPU state, in this case we should reset fpu.last_cpu to ensure fpu_lazy_restore() can't return true. Change init_fpu() to do this, it is called by user_regset->set() methods. Reported-by: Jan Kratochvil <jan.kratochvil@redhat.com> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Link: http://lkml.kernel.org/r/20120416204815.GB24884@redhat.com Cc: <stable@vger.kernel.org> v3.3 Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2012-04-16 16:23:59 -07:00
Andreas Herrmann	68894632af	x86/platform: Remove incorrect error message in x86_default_fixup_cpu_id() It's only called from amd.c:srat_detect_node(). The introduced condition for calling the fixup code is true for all AMD multi-node processors, e.g. Magny-Cours and Interlagos. There we have 2 NUMA nodes on one socket. Thus there are cores having different numa-node-id but with equal phys_proc_id. There is no point to print error messages in such a situation. The confusing/misleading error message was introduced with commit `64be4c1c24` ("x86: Add x86_init platform override to fix up NUMA core numbering"). Remove the default fixup function (especially the error message) and replace it by a NULL pointer check, move the Numascale-specific condition for calling the fixup into the fixup-function itself and slightly adapt the comment. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Acked-by: Borislav Petkov <borislav.petkov@amd.com> Cc: <stable@kernel.org> Cc: <sp@numascale.com> Cc: <bp@amd64.org> Cc: <daniel@numascale-asia.com> Link: http://lkml.kernel.org/r/20120402160648.GR27684@alberich.amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-04-16 20:43:43 +02:00
Josh Triplett	a7e0e4e99f	x86: Fix typo in MODULE_DEVICE_TABLE example: s/x86_cpu/x86cpu/ Signed-off-by: Josh Triplett <josh@joshtriplett.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2012-04-16 14:20:19 +02:00
Andreas Herrmann	d7de8649f3	x86/amd: Remove broken links from comment and kernel message Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Link: http://lkml.kernel.org/r/20120411151238.GA4794@alberich.amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-04-16 08:53:43 +02:00
Ingo Molnar	1c2b443e2a	A two-patch fix from Andreas taking care of a sysfs warning when the microcode driver is loaded on unsupported platforms. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJPh/jXAAoJEBLB8Bhh3lVKhFUP/25nGddMi/wl4D76Ef8eJJRu lYVJGqatSCJIV+51DFaFnC9JQ2C10bMfs5it6tNNPTaTpTTss3b/opFs9M2CXJF5 GHY7XlIc7fC7v+Vur9Jgmz4QAldYVYtARDkx6uIuADiD6LvSsug384uFy6Yscdnh s02JTa9PeGYOz0RtTPxzrsGPablDukcfXV/Is/dLNygD+mHs8HdLWNRBDEZfDRea 9nH+zqwa/SbkGPZHbN8wIGLD7V+QMLWbY5YRvOHzlU+L5Hnjfik25+XlsuKKSv+B v6/HNV8YluAGNixIFlm9EkrzNXTMe8smhM50+8b0CjOX/n2ldi89VVheOsgoPaNa dTwyTcrfY71I/B/nyp4pIswEt4P96c2HiZpGO6b22tGxDP7EYtXPKka1v7reBP/O atDmK+3++JOzgaqbKjZkdiLawyzAxHgkLZAr02EsCb0IFF8s6a0HI23s9zOEem3T pifrcYE8j8VgUIvGBUwWWzKONqiXpXPAKU/bwqphQIsnOqIaXOTaCuyDI5PjQX9l Zri/UxXvoBEq+fuobM3W7dhFQ3AnN5h0Ri3L55Eak9DVJU8vQ3ZfbxAwXGBeVtut EDp9LkT4wP0WTY8PfJaZUQoIZ72m0g2sO2eEkcwPuT6ljQb7Rdsr1y5F3KzP4kRF o79lhzWZBAv3cBYvocNS =ZgVv -----END PGP SIGNATURE----- Merge tag 'microcode-fix-for-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp into x86/urgent Pull from Borislav Petkov a two-patch fix from Andreas taking care of a sysfs warning when the microcode driver is loaded on unsupported platforms. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-04-14 13:30:32 +02:00
Ingo Molnar	6ac1ef482d	Merge branch 'perf/core' into perf/uprobes Merge in latest upstream (and the latest perf development tree), to prepare for tooling changes, and also to pick up v3.4 MM changes that the uprobes code needs to take care of. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-04-14 13:19:04 +02:00
Will Drewry	c6cfbeb402	x86: Enable HAVE_ARCH_SECCOMP_FILTER Enable support for seccomp filter on x86: - syscall_get_arch() - syscall_get_arguments() - syscall_rollback() - syscall_set_return_value() - SIGSYS siginfo_t support - secure_computing is called from a ptrace_event()-safe context - secure_computing return value is checked (see below). SECCOMP_RET_TRACE and SECCOMP_RET_TRAP may result in seccomp needing to skip a system call without killing the process. This is done by returning a non-zero (-1) value from secure_computing. This change makes x86 respect that return value. To ensure that minimal kernel code is exposed, a non-zero return value results in an immediate return to user space (with an invalid syscall number). Signed-off-by: Will Drewry <wad@chromium.org> Reviewed-by: H. Peter Anvin <hpa@zytor.com> Acked-by: Eric Paris <eparis@redhat.com> Reviewed-by: Kees Cook <keescook@chromium.org> v18: rebase and tweaked change description, acked-by v17: added reviewed by and rebased v..: all rebases since original introduction. Signed-off-by: James Morris <james.l.morris@oracle.com>	2012-04-14 11:13:21 +10:00
Andreas Herrmann	283c1f2558	x86, microcode: Ensure that module is only loaded on supported AMD CPUs Exit early when there's no support for a particular CPU family. Also, fixup the "no support for this CPU vendor" to be issued only when the driver is attempted to be loaded on an unsupported vendor. Cc: stable@vger.kernel.org Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk> Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: http://lkml.kernel.org/r/20120411163849.GE4794@alberich.amd.com [Boris: add a commit msg because Andreas is lazy] Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2012-04-13 11:51:05 +02:00
Andreas Herrmann	a956bd6f85	x86, microcode: Fix sysfs warning during module unload on unsupported CPUs Loading the microcode driver on an unsupported CPU and subsequently unloading the driver causes WARNING: at fs/sysfs/group.c:138 mc_device_remove+0x5f/0x70 [microcode]() Hardware name: 01972NG sysfs group ffffffffa00013d0 not found for kobject 'cpu0' Modules linked in: snd_hda_codec_hdmi snd_hda_codec_conexant snd_hda_intel btusb snd_hda_codec bluetooth thinkpad_acpi rfkill microcode(-) [last unloaded: cfg80211] Pid: 4560, comm: modprobe Not tainted 3.4.0-rc2-00002-g258f742 #5 Call Trace: [<ffffffff8103113b>] ? warn_slowpath_common+0x7b/0xc0 [<ffffffff81031235>] ? warn_slowpath_fmt+0x45/0x50 [<ffffffff81120e74>] ? sysfs_remove_group+0x34/0x120 [<ffffffffa00000ef>] ? mc_device_remove+0x5f/0x70 [microcode] [<ffffffff81331eb9>] ? subsys_interface_unregister+0x69/0xa0 [<ffffffff81563526>] ? mutex_lock+0x16/0x40 [<ffffffffa0000c3e>] ? microcode_exit+0x50/0x92 [microcode] [<ffffffff8107051d>] ? sys_delete_module+0x16d/0x260 [<ffffffff810a0065>] ? wait_iff_congested+0x45/0x110 [<ffffffff815656af>] ? page_fault+0x1f/0x30 [<ffffffff81565ba2>] ? system_call_fastpath+0x16/0x1b on recent kernels. This is due to commit `8a25a2fd12` ("cpu: convert 'cpu' and 'machinecheck' sysdev_class to a regular subsystem") which renders commit `6c53cbfced` ("x86, microcode: Correct sysdev_add error path") useless. See http://marc.info/?l=linux-kernel&m=133416246406478 Avoid above warning by restoring the old driver behaviour before `6c53cbfced` ("x86, microcode: Correct sysdev_add error path"). Cc: stable@vger.kernel.org Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk> Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: http://lkml.kernel.org/r/20120411163849.GE4794@alberich.amd.com Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2012-04-13 11:49:06 +02:00
Linus Torvalds	4a1d7544fe	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner. * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: Use correct byte-sized register constraint in __add() x86: Use correct byte-sized register constraint in __xchg_op() x86: vsyscall: Use NULL instead 0 for a pointer argument	2012-04-12 15:06:07 -07:00
Eric B Munson	248997095d	kvmclock: remove unneeded EXPORT macro check_and_clear_guest_paused does not need to be exported as it isn't used by any modules, remove the export. Signed-off-by: Eric B Munson <emunson@mgebm.net> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-04-08 12:49:54 +03:00
Eric B Munson	3b5d56b931	kvmclock: Add functions to check if the host has stopped the vm When a host stops or suspends a VM it will set a flag to show this. The watchdog will use these functions to determine if a softlockup is real, or the result of a suspended VM. Signed-off-by: Eric B Munson <emunson@mgebm.net> asm-generic changes Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-04-08 12:48:59 +03:00
Linus Torvalds	a3fac08085	Merge branch 'kvm-updates/3.4' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull a few KVM fixes from Avi Kivity: "A bunch of powerpc KVM fixes, a guest and a host RCU fix (unrelated), and a small build fix." * 'kvm-updates/3.4' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: Resolve RCU vs. async page fault problem KVM: VMX: vmx_set_cr0 expects kvm->srcu locked KVM: PMU: Fix integer constant is too large warning in kvm_pmu_set_msr() KVM: PPC: Book3S: PR: Fix preemption KVM: PPC: Save/Restore CR over vcpu_run KVM: PPC: Book3S HV: Save and restore CR in __kvmppc_vcore_entry KVM: PPC: Book3S HV: Fix kvm_alloc_linear in case where no linears exist KVM: PPC: Book3S: Compile fix for ppc32 in HIOR access code	2012-04-07 09:53:33 -07:00
Emil Goode	46ed99d1b7	x86: vsyscall: Use NULL instead 0 for a pointer argument This patch silences the following sparse warning: arch/x86/kernel/vsyscall_64.c:250:34: warning: Using plain integer as NULL pointer Signed-off-by: Emil Goode <emilgoode@gmail.com> Acked-by: Andy Lutomirski <luto@amacapital.net> Cc: john.stultz@linaro.org Link: http://lkml.kernel.org/r/1333306084-3776-1-git-send-email-emilgoode@gmail.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2012-04-06 11:49:59 +02:00
Linus Torvalds	5d32c88f0b	Merge branch 'akpm' (Andrew's patch-bomb) Merge batch of fixes from Andrew Morton: "The simple_open() cleanup was held back while I wanted for laggards to merge things. I still need to send a few checkpoint/restore patches. I've been wobbly about merging them because I'm wobbly about the overall prospects for success of the project. But after speaking with Pavel at the LSF conference, it sounds like they're further toward completion than I feared - apparently davem is at the "has stopped complaining" stage regarding the net changes. So I need to go back and re-review those patchs and their (lengthy) discussion." * emailed from Andrew Morton <akpm@linux-foundation.org>: (16 patches) memcg swap: use mem_cgroup_uncharge_swap fix backlight: add driver for DA9052/53 PMIC v1 C6X: use set_current_blocked() and block_sigmask() MAINTAINERS: add entry for sparse checker MAINTAINERS: fix REMOTEPROC F: typo alpha: use set_current_blocked() and block_sigmask() simple_open: automatically convert to simple_open() scripts/coccinelle/api/simple_open.cocci: semantic patch for simple_open() libfs: add simple_open() hugetlbfs: remove unregister_filesystem() when initializing module drivers/rtc/rtc-88pm860x.c: fix rtc irq enable callback fs/xattr.c:setxattr(): improve handling of allocation failures fs/xattr.c:listxattr(): fall back to vmalloc() if kmalloc() failed fs/xattr.c: suppress page allocation failure warnings from sys_listxattr() sysrq: use SEND_SIG_FORCED instead of force_sig() proc: fix mount -t proc -o AAA	2012-04-05 15:30:34 -07:00
Stephen Boyd	234e340582	simple_open: automatically convert to simple_open() Many users of debugfs copy the implementation of default_open() when they want to support a custom read/write function op. This leads to a proliferation of the default_open() implementation across the entire tree. Now that the common implementation has been consolidated into libfs we can replace all the users of this function with simple_open(). This replacement was done with the following semantic patch: <smpl> @ open @ identifier open_f != simple_open; identifier i, f; @@ -int open_f(struct inode i, struct file f) -{ ( -if (i->i_private) -f->private_data = i->i_private; \| -f->private_data = i->i_private; ) -return 0; -} @ has_open depends on open @ identifier fops; identifier open.open_f; @@ struct file_operations fops = { ... -.open = open_f, +.open = simple_open, ... }; </smpl> [akpm@linux-foundation.org: checkpatch fixes] Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Julia Lawall <Julia.Lawall@lip6.fr> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-04-05 15:25:50 -07:00
Gleb Natapov	e08759215b	KVM: Resolve RCU vs. async page fault problem "Page ready" async PF can kick vcpu out of idle state much like IRQ. We need to tell RCU about this. Reported-by: Sasha Levin <levinsasha928@gmail.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-04-05 19:26:14 +03:00
Linus Torvalds	6c216ec636	KGDB/KDB regression fixes 3.4: Fix an an Smatch warning that appeared in the 3.4 merge window 3.0: Fix kgdb test suite with SMP for all archs without HW single stepping 2.6.36: Fix kgdb sw breakpoints with CONFIG_DEBUG_RODATA=y limitations on x86 2.6.26: Fix oops on kgdb test suite with CONFIG_DEBUG_RODATA Fix kgdb test suite with SMP for all archs with HW single stepping -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJPedocAAoJEIciOldedpOjn7EP/397Rh0zmRlG8oQwMEJcK3E5 gaRyBNpkGoU3ekHXHx/nzgQ/CS9opzW7nBZDu8weWLjRKMT4RyHfuJcWyu525GvQ SnoiX2ZUzP315d8llCYwXmaCEYA7lHQi4T2bGMlDSn1J8kS235EQxllgEfhXDdEC DxRWgHABG2UR62C62sGKbPaMMDO9TcNcrAQK27LDLTS7pKLmYqBWBdZKgWzBM/Pr AF8vakqSgUw3Aq9qrLge+483uT7uhMoUJofxRppWtm1QgnDcTmri9LOagiazDotz RQliRGwVxj9hEo5mLEiQtI0N1kIGCAsK0+9aUJEZRXovRBR9kvqaqHT4c5xdhznr VKYvqqTcHBkKLIfNXFvQZnn2cXtNVNqve9CZZwdBJaFYEkaR7ZVQqE6f2xq8KAb2 RmhvzlEUyLU+89YKkH66uSa22VLSazkeH+4b8AJ4JxYDEab3BHoBCe8axcBQrTsj 7X5NOs7V3Oj+4J3bS1fbUbxq4t0dfpLLyg8e/lELWtT+Kq7nQRzA2XHRZAMTve8M T0cTdrwtUbgY9ZMTpywNB2KlPgTvhWOyfYbH6/Kcks7ecSXlkow3edXoiUbw79iE hP8vcMWbT2Rv3IbLkSMFZEQGAG9qL1YyGv4NDmLOoljO1c/Bi3WQIR5aI+di6asV Z5q5s/bmGa4+OhFFITSd =SW2N -----END PGP SIGNATURE----- Merge tag 'for_linus-3.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/kgdb Pull KGDB/KDB regression fixes from Jason Wessel: - Fix a Smatch warning that appeared in the 3.4 merge window - Fix kgdb test suite with SMP for all archs without HW single stepping - Fix kgdb sw breakpoints with CONFIG_DEBUG_RODATA=y limitations on x86 - Fix oops on kgdb test suite with CONFIG_DEBUG_RODATA - Fix kgdb test suite with SMP for all archs with HW single stepping * tag 'for_linus-3.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/kgdb: x86,kgdb: Fix DEBUG_RODATA limitation using text_poke() kgdb,debug_core: pass the breakpoint struct instead of address and memory kgdbts: (2 of 2) fix single step awareness to work correctly with SMP kgdbts: (1 of 2) fix single step awareness to work correctly with SMP kgdbts: Fix kernel oops with CONFIG_DEBUG_RODATA kdb: Fix smatch warning on dbg_io_ops->is_console	2012-04-04 17:26:08 -07:00
Linus Torvalds	58bca4a8fa	Merge branch 'for-linus' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping Pull DMA mapping branch from Marek Szyprowski: "Short summary for the whole series: A few limitations have been identified in the current dma-mapping design and its implementations for various architectures. There exist more than one function for allocating and freeing the buffers: currently these 3 are used dma_{alloc, free}_coherent, dma_{alloc,free}_writecombine, dma_{alloc,free}_noncoherent. For most of the systems these calls are almost equivalent and can be interchanged. For others, especially the truly non-coherent ones (like ARM), the difference can be easily noticed in overall driver performance. Sadly not all architectures provide implementations for all of them, so the drivers might need to be adapted and cannot be easily shared between different architectures. The provided patches unify all these functions and hide the differences under the already existing dma attributes concept. The thread with more references is available here: http://www.spinics.net/lists/linux-sh/msg09777.html These patches are also a prerequisite for unifying DMA-mapping implementation on ARM architecture with the common one provided by dma_map_ops structure and extending it with IOMMU support. More information is available in the following thread: http://thread.gmane.org/gmane.linux.kernel.cross-arch/12819 More works on dma-mapping framework are planned, especially in the area of buffer sharing and managing the shared mappings (together with the recently introduced dma_buf interface: commit `d15bd7ee44` "dma-buf: Introduce dma buffer sharing mechanism"). The patches in the current set introduce a new alloc/free methods (with support for memory attributes) in dma_map_ops structure, which will later replace dma_alloc_coherent and dma_alloc_writecombine functions." People finally started piping up with support for merging this, so I'm merging it as the last of the pending stuff from the merge window. Looks like pohmelfs is going to wait for 3.5 and more external support for merging. * 'for-linus' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping: common: DMA-mapping: add NON-CONSISTENT attribute common: DMA-mapping: add WRITE_COMBINE attribute common: dma-mapping: introduce mmap method common: dma-mapping: remove old alloc_coherent and free_coherent methods Hexagon: adapt for dma_map_ops changes Unicore32: adapt for dma_map_ops changes Microblaze: adapt for dma_map_ops changes SH: adapt for dma_map_ops changes Alpha: adapt for dma_map_ops changes SPARC: adapt for dma_map_ops changes PowerPC: adapt for dma_map_ops changes MIPS: adapt for dma_map_ops changes X86 & IA64: adapt for dma_map_ops changes common: dma-mapping: introduce generic alloc() and free() methods	2012-04-04 17:13:43 -07:00
Linus Torvalds	66cfb32772	Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar. * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/p4: Add format attributes tracing, sched, vfs: Fix 'old_pid' usage in trace_sched_process_exec()	2012-04-04 10:04:42 -07:00
Linus Torvalds	6742259866	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar. * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, kvm: Call restore_sched_clock_state() only after %gs is initialized x86: Use -mno-avx when available x86: Remove the ancient and deprecated disable_hlt() and enable_hlt() facility x86: Preserve lazy irq disable semantics in fixup_irqs()	2012-04-04 10:04:01 -07:00
Peter Zijlstra	7b8e6da46b	perf/x86/p4: Add format attributes Steven reported his P4 not booting properly, the missing format attributes cause a NULL ptr deref. Cure this by adding the missing format specification. I took the format description out of the comment near p4_config_pack*() and hope that comment is still relatively accurate. Reported-by: Steven Rostedt <rostedt@goodmis.org> Reported-by: Bruno Prémont <bonbons@linux-vserver.org> Tested-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Lin Ming <ming.m.lin@intel.com> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1332859842.16159.227.camel@twins Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-04-03 08:33:38 +02:00
Linus Torvalds	f187e9fd68	Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates and fixes from Ingo Molnar: "It's mostly fixes, but there's also two late items: - preliminary GTK GUI support for perf report - PMU raw event format descriptors in sysfs, to be parsed by tooling The raw event format in sysfs is a new ABI. For example for the 'CPU' PMU we have: aldebaran:~> ll /sys/bus/event_source/devices/cpu/format/* -r--r--r--. 1 root root 4096 Mar 31 10:29 /sys/bus/event_source/devices/cpu/format/any -r--r--r--. 1 root root 4096 Mar 31 10:29 /sys/bus/event_source/devices/cpu/format/cmask -r--r--r--. 1 root root 4096 Mar 31 10:29 /sys/bus/event_source/devices/cpu/format/edge -r--r--r--. 1 root root 4096 Mar 31 10:29 /sys/bus/event_source/devices/cpu/format/event -r--r--r--. 1 root root 4096 Mar 31 10:29 /sys/bus/event_source/devices/cpu/format/inv -r--r--r--. 1 root root 4096 Mar 31 10:29 /sys/bus/event_source/devices/cpu/format/offcore_rsp -r--r--r--. 1 root root 4096 Mar 31 10:29 /sys/bus/event_source/devices/cpu/format/pc -r--r--r--. 1 root root 4096 Mar 31 10:29 /sys/bus/event_source/devices/cpu/format/umask those lists of fields contain a specific format: aldebaran:~> cat /sys/bus/event_source/devices/cpu/format/offcore_rsp config1:0-63 So, those who wish to specify raw events can now use the following event format: -e cpu/cmask=1,event=2,umask=3 Most people will not want to specify any events (let alone raw events), they'll just use whatever default event the tools use. But for more obscure PMU events that have no cross-architecture generic events the above syntax is more usable and a bit more structured than specifying hex numbers." * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits) perf tools: Remove auto-generated bison/flex files perf annotate: Fix off by one symbol hist size allocation and hit accounting perf tools: Add missing ref-cycles event back to event parser perf annotate: addr2line wants addresses in same format as objdump perf probe: Finder fails to resolve function name to address tracing: Fix ent_size in trace output perf symbols: Handle NULL dso in dso__name_len perf symbols: Do not include libgen.h perf tools: Fix bug in raw sample parsing perf tools: Fix display of first level of callchains perf tools: Switch module.h into export.h perf: Move mmap page data_head offset assertion out of header perf: Fix mmap_page capabilities and docs perf diff: Fix to work with new hists design perf tools: Fix modifier to be applied on correct events perf tools: Fix various casting issues for 32 bits perf tools: Simplify event_read_id exit path tracing: Fix ftrace stack trace entries tracing: Move the tracing_on/off() declarations into CONFIG_TRACING perf report: Add a simple GTK2-based 'perf report' browser ...	2012-03-31 13:34:04 -07:00
Linus Torvalds	a335750b9a	Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux Pull ACPI & Power Management changes from Len Brown: - ACPI 5.0 after-ripples, ACPICA/Linux divergence cleanup - cpuidle evolving, more ARM use - thermal sub-system evolving, ditto - assorted other PM bits Fix up conflicts in various cpuidle implementations due to ARM cpuidle cleanups (ARM at91 self-refresh and cpu idle code rewritten into "standby" in asm conflicting with the consolidation of cpuidle time keeping), trivial SH include file context conflict and RCU tracing fixes in generic code. * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: (77 commits) ACPI throttling: fix endian bug in acpi_read_throttling_status() Disable MCP limit exceeded messages from Intel IPS driver ACPI video: Don't start video device until its associated input device has been allocated ACPI video: Harden video bus adding. ACPI: Add support for exposing BGRT data ACPI: export acpi_kobj ACPI: Fix logic for removing mappings in 'acpi_unmap' CPER failed to handle generic error records with multiple sections ACPI: Clean redundant codes in scan.c ACPI: Fix unprotected smp_processor_id() in acpi_processor_cst_has_changed() ACPI: consistently use should_use_kmap() PNPACPI: Fix device ref leaking in acpi_pnp_match ACPI: Fix use-after-free in acpi_map_lsapic ACPI: processor_driver: add missing kfree ACPI, APEI: Fix incorrect APEI register bit width check and usage Update documentation for parameter notrigger in einj.txt ACPI, APEI, EINJ, new parameter to control trigger action ACPI, APEI, EINJ, limit the range of einj_param ACPI, APEI, Fix ERST header length check cpuidle: power_usage should be declared signed integer ...	2012-03-30 16:45:39 -07:00
Len Brown	d326f44e5f	Merge branch 'tboot' into release Conflicts: drivers/acpi/acpica/hwsleep.c Text conflict between: `2feec47d4c` (ACPICA: ACPI 5: Support for new FADT SleepStatus, SleepControl registers) which removed #include "actables.h" and `09f98a825a` (x86, acpi, tboot: Have a ACPI os prepare sleep instead of calling tboot_sleep.) which removed #include <linux/tboot.h> The resolution is to remove them both. Signed-off-by: Len Brown <len.brown@intel.com>	2012-03-30 16:38:59 -04:00
Len Brown	1a05e46787	Merge branches 'acpica', 'bgrt', 'bz-11533', 'cpuidle', 'ec', 'hotplug', 'misc', 'red-hat-bz-727865', 'thermal', 'throttling', 'turbostat' and 'video' into release Signed-off-by: Len Brown <len.brown@intel.com>	2012-03-30 16:10:37 -04:00
Petr Vandrovec	ac909ec308	ACPI: Fix use-after-free in acpi_map_lsapic When processor is being hot-added to the system, acpi_map_lsapic invokes ACPI _MAT method to find APIC ID and flags, verifies that returned structure is indeed ACPI's local APIC structure, and that flags contain MADT_ENABLED bit. Then saves APIC ID, frees structure - and accesses structure when computing arguments for acpi_register_lapic call. Which sometime leads to acpi_register_lapic call being made with second argument zero, failing to bring processor online with error 'Unable to map lapic to logical cpu number'. As lapic->lapic_flags & ACPI_MADT_ENABLED was already confirmed to be non-zero few lines above, we can just pass unconditional ACPI_MADT_ENABLED to the acpi_register_lapic. Signed-off-by: Petr Vandrovec <petr@vmware.com> Signed-off-by: Alok N Kataria <akataria@vmware.com> Reviewed-by: Toshi Kani <toshi.kani@hp.com> Signed-off-by: Len Brown <len.brown@intel.com>	2012-03-30 03:31:58 -04:00
Boris Ostrovsky	1a022e3f1b	idle, x86: Allow off-lined CPU to enter deeper C states Currently when a CPU is off-lined it enters either MWAIT-based idle or, if MWAIT is not desired or supported, HLT-based idle (which places the processor in C1 state). This patch allows processors without MWAIT support to stay in states deeper than C1. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@amd.com> Signed-off-by: Len Brown <len.brown@intel.com>	2012-03-30 03:23:01 -04:00
Len Brown	f6365201d8	x86: Remove the ancient and deprecated disable_hlt() and enable_hlt() facility The X86_32-only disable_hlt/enable_hlt mechanism was used by the 32-bit floppy driver. Its effect was to replace the use of the HLT instruction inside default_idle() with cpu_relax() - essentially it turned off the use of HLT. This workaround was commented in the code as: "disable hlt during certain critical i/o operations" "This halt magic was a workaround for ancient floppy DMA wreckage. It should be safe to remove." H. Peter Anvin additionally adds: "To the best of my knowledge, no-hlt only existed because of flaky power distributions on 386/486 systems which were sold to run DOS. Since DOS did no power management of any kind, including HLT, the power draw was fairly uniform; when exposed to the much hhigher noise levels you got when Linux used HLT caused some of these systems to fail. They were by far in the minority even back then." Alan Cox further says: "Also for the Cyrix 5510 which tended to go castors up if a HLT occurred during a DMA cycle and on a few other boxes HLT during DMA tended to go astray. Do we care ? I doubt it. The 5510 was pretty obscure, the 5520 fixed it, the 5530 is probably the oldest still in any kind of use." So, let's finally drop this. Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Josh Boyer <jwboyer@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: "H. Peter Anvin" <hpa@zytor.com> Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Stephen Hemminger <shemminger@vyatta.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: <stable@kernel.org> Link: http://lkml.kernel.org/n/tip-3rhk9bzf0x9rljkv488tloib@git.kernel.org [ If anyone cares then alternative instruction patching could be used to replace HLT with a one-byte NOP instruction. Much simpler. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-03-30 08:50:27 +02:00
Ingo Molnar	186e54cbe1	Merge branch 'linus' into x86/urgent Merge reason: Needed for include file dependencies. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2012-03-30 08:50:06 +02:00
Linus Torvalds	eb05df9e7e	Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cleanups from Peter Anvin: "The biggest textual change is the cleanup to use symbolic constants for x86 trap values. The only functional change and the reason for the x86/x32 dependency is the move of is_ia32_task() into <asm/thread_info.h> so that it can be used in other code that needs to understand if a system call comes from the compat entry point (and therefore uses i386 system call numbers) or not. One intended user for that is the BPF system call filter. Moving it out of <asm/compat.h> means we can define it unconditionally, returning always true on i386." * 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: Move is_ia32_task to asm/thread_info.h from asm/compat.h x86: Rename trap_no to trap_nr in thread_struct x86: Use enum instead of literals for trap values	2012-03-29 18:21:35 -07:00
Linus Torvalds	a591afc01d	Merge branch 'x86-x32-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x32 support for x86-64 from Ingo Molnar: "This tree introduces the X32 binary format and execution mode for x86: 32-bit data space binaries using 64-bit instructions and 64-bit kernel syscalls. This allows applications whose working set fits into a 32 bits address space to make use of 64-bit instructions while using a 32-bit address space with shorter pointers, more compressed data structures, etc." Fix up trivial context conflicts in arch/x86/{Kconfig,vdso/vma.c} * 'x86-x32-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (71 commits) x32: Fix alignment fail in struct compat_siginfo x32: Fix stupid ia32/x32 inversion in the siginfo format x32: Add ptrace for x32 x32: Switch to a 64-bit clock_t x32: Provide separate is_ia32_task() and is_x32_task() predicates x86, mtrr: Use explicit sizing and padding for the 64-bit ioctls x86/x32: Fix the binutils auto-detect x32: Warn and disable rather than error if binutils too old x32: Only clear TIF_X32 flag once x32: Make sure TS_COMPAT is cleared for x32 tasks fs: Remove missed ->fds_bits from cessation use of fd_set structs internally fs: Fix close_on_exec pointer in alloc_fdtable x32: Drop non-__vdso weak symbols from the x32 VDSO x32: Fix coding style violations in the x32 VDSO code x32: Add x32 VDSO support x32: Allow x32 to be configured x32: If configured, add x32 system calls to system call tables x32: Handle process creation x32: Signal-related system calls x86: Add #ifdef CONFIG_COMPAT to <asm/sys_ia32.h> ...	2012-03-29 18:12:23 -07:00
Jason Wessel	3751d3e85c	x86,kgdb: Fix DEBUG_RODATA limitation using text_poke() There has long been a limitation using software breakpoints with a kernel compiled with CONFIG_DEBUG_RODATA going back to 2.6.26. For this particular patch, it will apply cleanly and has been tested all the way back to 2.6.36. The kprobes code uses the text_poke() function which accommodates writing a breakpoint into a read-only page. The x86 kgdb code can solve the problem similarly by overriding the default breakpoint set/remove routines and using text_poke() directly. The x86 kgdb code will first attempt to use the traditional probe_kernel_write(), and next try using a the text_poke() function. The break point install method is tracked such that the correct break point removal routine will get called later on. Cc: x86@kernel.org Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: stable@vger.kernel.org # >= 2.6.36 Inspried-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Jason Wessel <jason.wessel@windriver.com>	2012-03-29 17:41:25 -05:00
Linus Torvalds	7fda0412c5	Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar. * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: cpusets: Remove an unused variable sched/rt: Improve pick_next_highest_task_rt() sched: Fix select_fallback_rq() vs cpu_active/cpu_online sched/x86/smp: Do not enable IRQs over calibrate_delay() sched: Fix compiler warning about declared inline after use MAINTAINERS: Update email address for SCHEDULER and PERF EVENTS	2012-03-29 14:46:05 -07:00
Linus Torvalds	6b8212a313	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 updates from Ingo Molnar. This touches some non-x86 files due to the sanitized INLINE_SPIN_UNLOCK config usage. Fixed up trivial conflicts due to just header include changes (removing headers due to cpu_idle() merge clashing with the <asm/system.h> split). * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/apic/amd: Be more verbose about LVT offset assignments x86, tls: Off by one limit check x86/ioapic: Add io_apic_ops driver layer to allow interception x86/olpc: Add debugfs interface for EC commands x86: Merge the x86_32 and x86_64 cpu_idle() functions x86/kconfig: Remove CONFIG_TR=y from the defconfigs x86: Stop recursive fault in print_context_stack after stack overflow x86/io_apic: Move and reenable irq only when CONFIG_GENERIC_PENDING_IRQ=y x86/apic: Add separate apic_id_valid() functions for selected apic drivers locking/kconfig: Simplify INLINE_SPIN_UNLOCK usage x86/kconfig: Update defconfigs x86: Fix excessive MSR print out when show_msr is not specified	2012-03-29 14:28:26 -07:00
Linus Torvalds	bcd550745f	Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer core updates from Thomas Gleixner. * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: ia64: vsyscall: Add missing paranthesis alarmtimer: Don't call rtc_timer_init() when CONFIG_RTC_CLASS=n x86: vdso: Put declaration before code x86-64: Inline vdso clock_gettime helpers x86-64: Simplify and optimize vdso clock_gettime monotonic variants kernel-time: fix s/then/than/ spelling errors time: remove no_sync_cmos_clock time: Avoid scary backtraces when warning of > 11% adj alarmtimer: Make sure we initialize the rtctimer ntp: Fix leap-second hrtimer livelock x86, tsc: Skip refined tsc calibration on systems with reliable TSC rtc: Provide flag for rtc devices that don't support UIE ia64: vsyscall: Use seqcount instead of seqlock x86: vdso: Use seqcount instead of seqlock x86: vdso: Remove bogus locking in update_vsyscall_tz() time: Remove bogus comments time: Fix change_clocksource locking time: x86: Fix race switching from vsyscall to non-vsyscall clock	2012-03-29 14:16:48 -07:00
Liu, Chuansheng	99dd5497e5	x86: Preserve lazy irq disable semantics in fixup_irqs() The default irq_disable() sematics are to mark the interrupt disabled, but keep it unmasked. If the interrupt is delivered while marked disabled, the low level interrupt handler masks it and marks it pending. This is important for detecting wakeup interrupts during suspend and for edge type interrupts to avoid losing interrupts. fixup_irqs() moves the interrupts away from an offlined cpu. For certain interrupt types it needs to mask the interrupt line before changing the affinity. After affinity has changed the interrupt line is unmasked again, but only if it is not marked disabled. This breaks the lazy irq disable semantics and causes problems in suspend as the interrupt can be lost or wakeup functionality is broken. Check irqd_irq_masked() instead of irqd_irq_disabled() because irqd_irq_masked() is only set, when the core code actually masked the interrupt line. If it's not set, we unmask the interrupt and let the lazy irq disable logic deal with an eventually incoming interrupt. [ tglx: Massaged changelog and added a comment ] Signed-off-by: liu chuansheng <chuansheng.liu@intel.com> Cc: Yanmin Zhang <yanmin_zhang@linux.intel.com> Link: http://lkml.kernel.org/r/27240C0AC20F114CBF8149A2696CBE4A05DFB3@SHSMSX101.ccr.corp.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2012-03-29 15:28:47 +02:00
Linus Torvalds	532bfc851a	Merge branch 'akpm' (Andrew's patch-bomb) Merge third batch of patches from Andrew Morton: - Some MM stragglers - core SMP library cleanups (on_each_cpu_mask) - Some IPI optimisations - kexec - kdump - IPMI - the radix-tree iterator work - various other misc bits. "That'll do for -rc1. I still have ~10 patches for 3.4, will send those along when they've baked a little more." * emailed from Andrew Morton <akpm@linux-foundation.org>: (35 commits) backlight: fix typo in tosa_lcd.c crc32: add help text for the algorithm select option mm: move hugepage test examples to tools/testing/selftests/vm mm: move slabinfo.c to tools/vm mm: move page-types.c from Documentation to tools/vm selftests/Makefile: make `run_tests' depend on `all' selftests: launch individual selftests from the main Makefile radix-tree: use iterators in find_get_pages* functions radix-tree: rewrite gang lookup using iterator radix-tree: introduce bit-optimized iterator fs/proc/namespaces.c: prevent crash when ns_entries[] is empty nbd: rename the nbd_device variable from lo to nbd pidns: add reboot_pid_ns() to handle the reboot syscall sysctl: use bitmap library functions ipmi: use locks on watchdog timeout set on reboot ipmi: simplify locking ipmi: fix message handling during panics ipmi: use a tasklet for handling received messages ipmi: increase KCS timeouts ipmi: decrease the IPMI message transaction time in interrupt mode ...	2012-03-28 17:19:28 -07:00
Dave Young	09c71bfd83	kdump x86: fix total mem size calculation for reservation crashkernel reservation need know the total memory size. Current get_total_mem simply use max_pfn - min_low_pfn. It is wrong because it will including memory holes in the middle. Especially for kvm guest with memory > 0xe0000000, there's below in qemu code: qemu split memory as below: if (ram_size >= 0xe0000000 ) { above_4g_mem_size = ram_size - 0xe0000000; below_4g_mem_size = 0xe0000000; } else { below_4g_mem_size = ram_size; } So for 4G mem guest, seabios will insert a 512M usable region beyond of 4G. Thus in above case max_pfn - min_low_pfn will be more than original memsize. Fixing this issue by using memblock_phys_mem_size() to get the total memsize. Signed-off-by: Dave Young <dyoung@redhat.com> Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com> Reviewed-by: Simon Horman <horms@verge.net.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-03-28 17:14:36 -07:00
Linus Torvalds	0195c00244	Disintegrate and delete asm/system.h -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIVAwUAT3NKzROxKuMESys7AQKElw/+JyDxJSlj+g+nymkx8IVVuU8CsEwNLgRk 8KEnRfLhGtkXFLSJYWO6jzGo16F8Uqli1PdMFte/wagSv0285/HZaKlkkBVHdJ/m u40oSjgT013bBh6MQ0Oaf8pFezFUiQB5zPOA9QGaLVGDLXCmgqUgd7exaD5wRIwB ZmyItjZeAVnDfk1R+ZiNYytHAi8A5wSB+eFDCIQYgyulA1Igd1UnRtx+dRKbvc/m rWQ6KWbZHIdvP1ksd8wHHkrlUD2pEeJ8glJLsZUhMm/5oMf/8RmOCvmo8rvE/qwl eDQ1h4cGYlfjobxXZMHqAN9m7Jg2bI946HZjdb7/7oCeO6VW3FwPZ/Ic75p+wp45 HXJTItufERYk6QxShiOKvA+QexnYwY0IT5oRP4DrhdVB/X9cl2MoaZHC+RbYLQy+ /5VNZKi38iK4F9AbFamS7kd0i5QszA/ZzEzKZ6VMuOp3W/fagpn4ZJT1LIA3m4A9 Q0cj24mqeyCfjysu0TMbPtaN+Yjeu1o1OFRvM8XffbZsp5bNzuTDEvviJ2NXw4vK 4qUHulhYSEWcu9YgAZXvEWDEM78FXCkg2v/CrZXH5tyc95kUkMPcgG+QZBB5wElR FaOKpiC/BuNIGEf02IZQ4nfDxE90QwnDeoYeV+FvNj9UEOopJ5z5bMPoTHxm4cCD NypQthI85pc= =G9mT -----END PGP SIGNATURE----- Merge tag 'split-asm_system_h-for-linus-20120328' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-asm_system Pull "Disintegrate and delete asm/system.h" from David Howells: "Here are a bunch of patches to disintegrate asm/system.h into a set of separate bits to relieve the problem of circular inclusion dependencies. I've built all the working defconfigs from all the arches that I can and made sure that they don't break. The reason for these patches is that I recently encountered a circular dependency problem that came about when I produced some patches to optimise get_order() by rewriting it to use ilog2(). This uses bitops - and on the SH arch asm/bitops.h drags in asm-generic/get_order.h by a circuituous route involving asm/system.h. The main difficulty seems to be asm/system.h. It holds a number of low level bits with no/few dependencies that are commonly used (eg. memory barriers) and a number of bits with more dependencies that aren't used in many places (eg. switch_to()). These patches break asm/system.h up into the following core pieces: (1) asm/barrier.h Move memory barriers here. This already done for MIPS and Alpha. (2) asm/switch_to.h Move switch_to() and related stuff here. (3) asm/exec.h Move arch_align_stack() here. Other process execution related bits could perhaps go here from asm/processor.h. (4) asm/cmpxchg.h Move xchg() and cmpxchg() here as they're full word atomic ops and frequently used by atomic_xchg() and atomic_cmpxchg(). (5) asm/bug.h Move die() and related bits. (6) asm/auxvec.h Move AT_VECTOR_SIZE_ARCH here. Other arch headers are created as needed on a per-arch basis." Fixed up some conflicts from other header file cleanups and moving code around that has happened in the meantime, so David's testing is somewhat weakened by that. We'll find out anything that got broken and fix it.. * tag 'split-asm_system_h-for-linus-20120328' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-asm_system: (38 commits) Delete all instances of asm/system.h Remove all #inclusions of asm/system.h Add #includes needed to permit the removal of asm/system.h Move all declarations of free_initmem() to linux/mm.h Disintegrate asm/system.h for OpenRISC Split arch_align_stack() out from asm-generic/system.h Split the switch_to() wrapper out of asm-generic/system.h Move the asm-generic/system.h xchg() implementation to asm-generic/cmpxchg.h Create asm-generic/barrier.h Make asm-generic/cmpxchg.h #include asm-generic/cmpxchg-local.h Disintegrate asm/system.h for Xtensa Disintegrate asm/system.h for Unicore32 [based on ver #3, changed by gxt] Disintegrate asm/system.h for Tile Disintegrate asm/system.h for Sparc Disintegrate asm/system.h for SH Disintegrate asm/system.h for Score Disintegrate asm/system.h for S390 Disintegrate asm/system.h for PowerPC Disintegrate asm/system.h for PA-RISC Disintegrate asm/system.h for MN10300 ...	2012-03-28 15:58:21 -07:00
Linus Torvalds	2e7580b0e7	Merge branch 'kvm-updates/3.4' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm updates from Avi Kivity: "Changes include timekeeping improvements, support for assigning host PCI devices that share interrupt lines, s390 user-controlled guests, a large ppc update, and random fixes." This is with the sign-off's fixed, hopefully next merge window we won't have rebased commits. * 'kvm-updates/3.4' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (130 commits) KVM: Convert intx_mask_lock to spin lock KVM: x86: fix kvm_write_tsc() TSC matching thinko x86: kvmclock: abstract save/restore sched_clock_state KVM: nVMX: Fix erroneous exception bitmap check KVM: Ignore the writes to MSR_K7_HWCR(3) KVM: MMU: make use of ->root_level in reset_rsvds_bits_mask KVM: PMU: add proper support for fixed counter 2 KVM: PMU: Fix raw event check KVM: PMU: warn when pin control is set in eventsel msr KVM: VMX: Fix delayed load of shared MSRs KVM: use correct tlbs dirty type in cmpxchg KVM: Allow host IRQ sharing for assigned PCI 2.3 devices KVM: Ensure all vcpus are consistent with in-kernel irqchip settings KVM: x86 emulator: Allow PM/VM86 switch during task switch KVM: SVM: Fix CPL updates KVM: x86 emulator: VM86 segments must have DPL 3 KVM: x86 emulator: Fix task switch privilege checks arch/powerpc/kvm/book3s_hv.c: included linux/sched.h twice KVM: x86 emulator: correctly mask pmc index bits in RDPMC instruction emulation KVM: mmu_notifier: Flush TLBs before releasing mmu_lock ...	2012-03-28 14:35:31 -07:00
Robert Richter	8abc3122aa	x86/apic/amd: Be more verbose about LVT offset assignments Add information about LVT offset assignments to better debug firmware bugs related to this. See following examples. # dmesg \| grep -i 'offset\\|ibs' LVT offset 0 assigned for vector 0xf9 [Firmware Bug]: cpu 0, try to use APIC500 (LVT offset 0) for vector 0x10400, but the register is already in use for vector 0xf9 on another cpu [Firmware Bug]: cpu 0, IBS interrupt offset 0 not available (MSRC001103A=0x0000000000000100) Failed to setup IBS, -22 In this case the BIOS assigns both offsets for MCE (0xf9) and IBS (0x400) vectors to offset 0, which is why the second APIC setup (IBS) failed. With correct setup you get: # dmesg \| grep -i 'offset\\|ibs' LVT offset 0 assigned for vector 0xf9 LVT offset 1 assigned for vector 0x400 IBS: LVT offset 1 assigned perf: AMD IBS detected (0x00000007) oprofile: AMD IBS detected (0x00000007) Note: The vector includes also the message type to handle also NMIs (0x400). In the firmware bug message the format is the same as of the APIC500 register and includes the mask bit (bit 16) in addition. Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-03-28 20:02:39 +02:00
David Howells	f05e798ad4	Disintegrate asm/system.h for X86 Disintegrate asm/system.h for X86. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: H. Peter Anvin <hpa@zytor.com> cc: x86@kernel.org	2012-03-28 18:11:12 +01:00
Dan Carpenter	8f0750f197	x86, tls: Off by one limit check These are used as offsets into an array of GDT_ENTRY_TLS_ENTRIES members so GDT_ENTRY_TLS_ENTRIES is one past the end of the array. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: http://lkml.kernel.org/r/20120324075250.GA28258@elgon.mountain Cc: <stable@vger.kernel.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2012-03-28 09:39:35 -07:00
Andrzej Pietrasiewicz	baa676fcf8	X86 & IA64: adapt for dma_map_ops changes Adapt core x86 and IA64 architecture code for dma_map_ops changes: replace alloc/free_coherent with generic alloc/free methods. Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com> Acked-by: Kyungmin Park <kyungmin.park@samsung.com> [removed swiotlb related changes and replaced it with wrappers, merged with IA64 patch to avoid inter-patch dependences in intel-iommu code] Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Tony Luck <tony.luck@intel.com>	2012-03-28 16:36:31 +02:00
Jeremy Fitzhardinge	136d249ef7	x86/ioapic: Add io_apic_ops driver layer to allow interception Xen dom0 needs to paravirtualize IO operations to the IO APIC, so add a io_apic_ops for it to intercept. Do this as ops structure because there's at least some chance that another paravirtualized environment may want to intercept these. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: jwboyer@redhat.com Cc: yinghai@kernel.org Link: http://lkml.kernel.org/r/1332385090-18056-2-git-send-email-konrad.wilk@oracle.com [ Made all the affected code easier on the eyes ] Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-03-28 09:49:29 +02:00
Peter Zijlstra	bc758133ed	sched/x86/smp: Do not enable IRQs over calibrate_delay() We should not ever enable IRQs until we're fully set up. This opens up a window where interrupts can hit the cpu and interrupts can do wakeups, wakeups need state that isn't set-up yet, in particular this cpu isn't elegible to run tasks, so if any cpu-affine task that got created in CPU_UP_PREPARE manages to get a wakeup, its affinity mask will get broken and we'll run into lots of 'interesting' problems. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/n/tip-yaezmlbriluh166tfkgni22m@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-03-27 14:50:10 +02:00
Ingo Molnar	7fd52392c5	Merge branch 'linus' into perf/urgent Merge reason: we need to fix a non-trivial merge conflict. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2012-03-26 17:19:03 +02:00
Richard Weinberger	90e240142b	x86: Merge the x86_32 and x86_64 cpu_idle() functions Both functions are mostly identical. The differences are: - x86_32's cpu_idle() makes use of check_pgt_cache(), which is a nop on both x86_32 and x86_64. - x86_64's cpu_idle() uses enter/__exit_idle/(), on x86_32 these function are a nop. - In contrast to x86_32, x86_64 calls rcu_idle_enter/exit() in the innermost loop because idle notifications need RCU. Calling these function on x86_32 also in the innermost loop does not hurt. So we can merge both functions. Signed-off-by: Richard Weinberger <richard@nod.at> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: paulmck@linux.vnet.ibm.com Cc: josh@joshtriplett.org Cc: tj@kernel.org Link: http://lkml.kernel.org/r/1332709204-22496-1-git-send-email-richard@nod.at Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-03-26 03:16:07 +02:00
Linus Torvalds	ed2d265d12	The following text was taken from the original review request: "[RFC - PATCH 0/7] consolidation of BUG support code." https://lkml.org/lkml/2012/1/26/525 -- The changes shown here are to unify linux's BUG support under the one <linux/bug.h> file. Due to historical reasons, we have some BUG code in bug.h and some in kernel.h -- i.e. the support for BUILD_BUG in linux/kernel.h predates the addition of linux/bug.h, but old code in kernel.h wasn't moved to bug.h at that time. As a band-aid, kernel.h was including <asm/bug.h> to pseudo link them. This has caused confusion[1] and general yuck/WTF[2] reactions. Here is an example that violates the principle of least surprise: CC lib/string.o lib/string.c: In function 'strlcat': lib/string.c:225:2: error: implicit declaration of function 'BUILD_BUG_ON' make[2]: *** [lib/string.o] Error 1 $ $ grep linux/bug.h lib/string.c #include <linux/bug.h> $ We've included <linux/bug.h> for the BUG infrastructure and yet we still get a compile fail! [We've not kernel.h for BUILD_BUG_ON.] Ugh - very confusing for someone who is new to kernel development. With the above in mind, the goals of this changeset are: 1) find and fix any include/.h files that were relying on the implicit presence of BUG code. 2) find and fix any C files that were consuming kernel.h and hence relying on implicitly getting some/all BUG code. 3) Move the BUG related code living in kernel.h to <linux/bug.h> 4) remove the asm/bug.h from kernel.h to finally break the chain. During development, the order was more like 3-4, build-test, 1-2. But to ensure that git history for bisect doesn't get needless build failures introduced, the commits have been reorderd to fix the problem areas in advance. [1] https://lkml.org/lkml/2012/1/3/90 [2] https://lkml.org/lkml/2012/1/17/414 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJPbNwpAAoJEOvOhAQsB9HWrqYP/A0t9VB0nK6e42F0OR2P14MZ GJFtf1B++wwioIrx+KSWSRfSur1C5FKhDbxLR3I/pvkAYl4+T4JvRdMG6xJwxyip CC1kVQQNDjWVVqzjz2x6rYkOffx6dUlw/ERyIyk+OzP+1HzRIsIrugMqbzGLlX0X y0v2Tbd0G6xg1DV8lcRdp95eIzcGuUvdb2iY2LGadWZczEOeSXx64Jz3QCFxg3aL LFU4oovsg8Nb7MRJmqDvHK/oQf5vaTm9WSrS0pvVte0msSQRn8LStYdWC0G9BPCS GwL86h/eLXlUXQlC5GpgWg1QQt5i2QpjBFcVBIG0IT5SgEPMx+gXyiqZva2KwbHu LKicjKtfnzPitQnyEV/N6JyV1fb1U6/MsB7ebU5nCCzt9Gr7MYbjZ44peNeprAtu HMvJ/BNnRr4Ha6nPQNu952AdASPKkxmeXFUwBL1zUbLkOX/bK/vy1ujlcdkFxCD7 fP3t7hghYa737IHk0ehUOhrE4H67hvxTSCKioLUAy/YeN1IcfH/iOQiCBQVLWmoS AqYV6ou9cqgdYoyila2UeAqegb+8xyubPIHt+lebcaKxs5aGsTg+r3vq5juMDAPs iwSVYUDcIw9dHer1lJfo7QCy3QUTRDTxh+LB9VlHXQICgeCK02sLBOi9hbEr4/H8 Ko9g8J3BMxcMkXLHT9ud =PYQT -----END PGP SIGNATURE----- Merge tag 'bug-for-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux Pull <linux/bug.h> cleanup from Paul Gortmaker: "The changes shown here are to unify linux's BUG support under the one <linux/bug.h> file. Due to historical reasons, we have some BUG code in bug.h and some in kernel.h -- i.e. the support for BUILD_BUG in linux/kernel.h predates the addition of linux/bug.h, but old code in kernel.h wasn't moved to bug.h at that time. As a band-aid, kernel.h was including <asm/bug.h> to pseudo link them. This has caused confusion[1] and general yuck/WTF[2] reactions. Here is an example that violates the principle of least surprise: CC lib/string.o lib/string.c: In function 'strlcat': lib/string.c:225:2: error: implicit declaration of function 'BUILD_BUG_ON' make[2]: ** [lib/string.o] Error 1 $ $ grep linux/bug.h lib/string.c #include <linux/bug.h> $ We've included <linux/bug.h> for the BUG infrastructure and yet we still get a compile fail! [We've not kernel.h for BUILD_BUG_ON.] Ugh - very confusing for someone who is new to kernel development. With the above in mind, the goals of this changeset are: 1) find and fix any include/.h files that were relying on the implicit presence of BUG code. 2) find and fix any C files that were consuming kernel.h and hence relying on implicitly getting some/all BUG code. 3) Move the BUG related code living in kernel.h to <linux/bug.h> 4) remove the asm/bug.h from kernel.h to finally break the chain. During development, the order was more like 3-4, build-test, 1-2. But to ensure that git history for bisect doesn't get needless build failures introduced, the commits have been reorderd to fix the problem areas in advance. [1] https://lkml.org/lkml/2012/1/3/90 [2] https://lkml.org/lkml/2012/1/17/414" Fix up conflicts (new radeon file, reiserfs header cleanups) as per Paul and linux-next. tag 'bug-for-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: kernel.h: doesn't explicitly use bug.h, so don't include it. bug: consolidate BUILD_BUG_ON with other bug code BUG: headers with BUG/BUG_ON etc. need linux/bug.h bug.h: add include of it to various implicit C users lib: fix implicit users of kernel.h for TAINT_WARN spinlock: macroize assert_spin_locked to avoid bug.h dependency x86: relocate get/set debugreg fcns to include/asm/debugreg.	2012-03-24 10:08:39 -07:00
Thomas Gleixner	68fe7b23d5	x86: vdso: Put declaration before code Sigh, warnings are there for a reason. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: John Stultz <john.stultz@linaro.org>	2012-03-24 09:29:22 +01:00
Hugh Dickins	65c0ff4079	x86: Stop recursive fault in print_context_stack after stack overflow After printing out the first line of a stack backtrace, print_context_stack() calls print_ftrace_graph_addr() to check if it's making a graph of function calls, usually not the case. But unfortunate ordering of assignments causes this to oops if an earlier stack overflow corrupted threadinfo->task. Reorder to avoid that irritation. ( The fact that there was a stack overflow may often be more interesting than the stack that can now be shown; but integrating that information with this stacktrace is awkward, so leave it to overflow reporting. ) Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/20120323225648.15DD5A033B@akpm.mtv.corp.google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-03-24 08:15:04 +01:00
Linus Torvalds	2fb9e96cad	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull additional x86 fixes from Peter Anvin: - address a long-standing bug related to when a kernel-spawned process gets a signal on an i386 kernel compiled without CONFIG_VM86. - fix the newly introduced build warning in arch/x86/boot. - fix a typo in the i386 system call table which affects building some libcs. * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86-32: Fix endless loop when processing signals for kernel tasks x86, boot: Correct CFLAGS for hostprogs x86-32: Fix typo for mq_getsetattr in syscall table	2012-03-23 17:10:09 -07:00
Linus Torvalds	8e3ade251b	Merge branch 'akpm' (Andrew's patch-bomb) Merge second batch of patches from Andrew Morton: - various misc things - core kernel changes to prctl, exit, exec, init, etc. - kernel/watchdog.c updates - get_maintainer - MAINTAINERS - the backlight driver queue - core bitops code cleanups - the led driver queue - some core prio_tree work - checkpatch udpates - largeish crc32 update - a new poll() feature for the v4l guys - the rtc driver queue - fatfs - ptrace - signals - kmod/usermodehelper updates - coredump - procfs updates * emailed from Andrew Morton <akpm@linux-foundation.org>: (141 commits) seq_file: add seq_set_overflow(), seq_overflow() proc-ns: use d_set_d_op() API to set dentry ops in proc_ns_instantiate(). procfs: speed up /proc/pid/stat, statm procfs: add num_to_str() to speed up /proc/stat proc: speed up /proc/stat handling fs/proc/kcore.c: make get_sparsemem_vmemmap_info() static coredump: add VM_NODUMP, MADV_NODUMP, MADV_CLEAR_NODUMP coredump: remove VM_ALWAYSDUMP flag kmod: make __request_module() killable kmod: introduce call_modprobe() helper usermodehelper: ____call_usermodehelper() doesn't need do_exit() usermodehelper: kill umh_wait, renumber UMH_* constants usermodehelper: implement UMH_KILLABLE usermodehelper: introduce umh_complete(sub_info) usermodehelper: use UMH_WAIT_PROC consistently signal: zap_pid_ns_processes: s/SEND_SIG_NOINFO/SEND_SIG_FORCED/ signal: oom_kill_task: use SEND_SIG_FORCED instead of force_sig() signal: cosmetic, s/from_ancestor_ns/force/ in prepare_signal() paths signal: give SEND_SIG_FORCED more power to beat SIGNAL_UNKILLABLE Hexagon: use set_current_blocked() and block_sigmask() ...	2012-03-23 16:59:10 -07:00
Akinobu Mita	0b2f4d4d76	x86: use for_each_clear_bit_from() Use for_each_clear_bit() to iterate over all the cleared bit in a memory region. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-03-23 16:58:34 -07:00
Akinobu Mita	307b1cd7ec	bitops: rename for_each_set_bit_cont() in favor of analogous list.h function This renames for_each_set_bit_cont() to for_each_set_bit_from() because it is analogous to list_for_each_entry_from() in list.h rather than list_for_each_entry_continue(). This doesn't remove for_each_set_bit_cont() for now. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Robert Richter <robert.richter@amd.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-03-23 16:58:33 -07:00
Andy Lutomirski	91ec87d57f	x86-64: Simplify and optimize vdso clock_gettime monotonic variants We used to store the wall-to-monotonic offset and the realtime base. It's faster to precompute the monotonic base. This is about a 3% speedup on Sandy Bridge for CLOCK_MONOTONIC. It's much more impressive for CLOCK_MONOTONIC_COARSE. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: John Stultz <john.stultz@linaro.org>	2012-03-23 16:49:33 -07:00
Linus Torvalds	475c77edf8	Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci Pull PCI changes (including maintainer change) from Jesse Barnes: "This pull has some good cleanups from Bjorn and Yinghai, as well as some more code from Yinghai to better handle resource re-allocation when enabled. There's also a new initcall_debug feature from Arjan which will print out quirk timing information to help identify slow quirks for fixing or refinement (Yinghai sent in a few patches to do just that once the new debug code landed). Beyond that, I'm handing off PCI maintainership to Bjorn Helgaas. He's been a core PCI and Linux contributor for some time now, and has kindly volunteered to take over. I just don't feel I have the time for PCI review and work that it deserves lately (I've taken on some other projects), and haven't been as responsive lately as I'd like, so I approached Bjorn asking if he'd like to manage things. He's going to give it a try, and I'm confident he'll do at least as well as I have in keeping the tree managed, patches flowing, and keeping things stable." Fix up some fairly trivial conflicts due to other cleanups (mips device resource fixup cleanups clashing with list handling cleanup, ppc iseries removal clashing with pci_probe_only cleanup etc) * 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci: (112 commits) PCI: Bjorn gets PCI hotplug too PCI: hand PCI maintenance over to Bjorn Helgaas unicore32/PCI: move <asm-generic/pci-bridge.h> include to asm/pci.h sparc/PCI: convert devtree and arch-probed bus addresses to resource powerpc/PCI: allow reallocation on PA Semi powerpc/PCI: convert devtree bus addresses to resource powerpc/PCI: compute I/O space bus-to-resource offset consistently arm/PCI: don't export pci_flags PCI: fix bridge I/O window bus-to-resource conversion x86/PCI: add spinlock held check to 'pcibios_fwaddrmap_lookup()' PCI / PCIe: Introduce command line option to disable ARI PCI: make acpihp use __pci_remove_bus_device instead PCI: export __pci_remove_bus_device PCI: Rename pci_remove_behind_bridge to pci_stop_and_remove_behind_bridge PCI: Rename pci_remove_bus_device to pci_stop_and_remove_bus_device PCI: print out PCI device info along with duration PCI: Move "pci reassigndev resource alignment" out of quirks.c PCI: Use class for quirk for usb host controller fixup PCI: Use class for quirk for ti816x class fixup PCI: Use class for quirk for intel e100 interrupt fixup ...	2012-03-23 14:02:12 -07:00
Linus Torvalds	a20ae85aba	Fixes: - Fix KDB keyboard repeat scan codes and leaked keyboard events - Fix kernel crash with kdb_printf() for users who compile new kdb_printf()'s in early code - Return all segment registers to gdb on x86_64 Features: - KDB/KGDB hook the reboot notifier and end user can control if it stops, detaches or does nothing (updated docs as well) - Notify users who use CONFIG_DEBUG_RODATA to use hw breakpoints -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJPbJ17AAoJEIciOldedpOjQeIP/AkUxQFJ7O4aLrLYHl62EHnh spkgkd+nBzIzcKyV73alkrVBaR2WE2822aAPQmAPBP8/X283DZJJjqgDCUNVI1Mf CIZ7g8AQHRnS+bAZmof5Jss4malZn4byLvG/cfpOivrsye+4A8MdrAKKM3pYWNVy 4xABkcEknVEEamdNEhHrcPd+xehretfw7+9mmU8hfjqHb/5cXFB7JwDcf4tF7ozT MDyN4xKtOn1W/ftQl0t6izksCUuPyqKzIfUyAy0j6AwTgsEavXu56S52T1UoB2ZI JcwLn/ZpN4eGCWVodY04R3jzaMtKFb6ImY40wsb5wl3CU3Ecy5syMU6z4fg3cvjH /KE6xWF61j4yiE5lzjeJVtKyxwalthzrr56qU2uEwrsEVmo3SOnW9sm0cwouqa7j /TAMlhZuGgbZGesFwdaUKI5OLGoki+pRQ0Gaq3TsbZwpPC5Bimkq0bIvruruKJCP QWVkEvb5TZgxCFS3AvniePOm7Hc2wD9zXB3OfN3o91pCom0ryDBIthcLlwhVeNCo Jd67pnJJNVULPF/1qVicZihKHxvG3DUb4E9pUcbgJplBke+isi+3eHOvnQrYFjIG iCamE9qvVbsQm/OFV8MOJ5mfPs9R+nb/jNzTO8JDmBc8AL7nRDV3AFGjW68x/KWT ERcqNEGJ4QuVAxfejq76 =SXu9 -----END PGP SIGNATURE----- Merge tag 'for_linus-3.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/kgdb Pull KGDB/KDB updates from Jason Wessel: "Fixes: - Fix KDB keyboard repeat scan codes and leaked keyboard events - Fix kernel crash with kdb_printf() for users who compile new kdb_printf()'s in early code - Return all segment registers to gdb on x86_64 Features: - KDB/KGDB hook the reboot notifier and end user can control if it stops, detaches or does nothing (updated docs as well) - Notify users who use CONFIG_DEBUG_RODATA to use hw breakpoints" * tag 'for_linus-3.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/kgdb: kdb: Add message about CONFIG_DEBUG_RODATA on failure to install breakpoint kdb: Avoid using dbg_io_ops until it is initialized kgdb,debug_core: add the ability to control the reboot notifier KDB: Fix usability issues relating to the 'enter' key. kgdb,debug-core,gdbstub: Hook the reboot notifier for debugger detach kgdb: Respect that flush op is optional kgdb: x86: Return all segment registers also in 64-bit mode	2012-03-23 09:29:44 -07:00
Alexander Gordeev	4da7072ad6	x86/io_apic: Move and reenable irq only when CONFIG_GENERIC_PENDING_IRQ=y This patch removes dead code from certain .config variations. When CONFIG_GENERIC_PENDING_IRQ=n irq move and reenable code is never get executed, nor do_unmask_irq variable updates its init value. Move the code under CONFIG_GENERIC_PENDING_IRQ macro. Signed-off-by: Alexander Gordeev <agordeev@redhat.com> Link: http://lkml.kernel.org/r/20120320141935.GA24806@dhcp-26-207.brq.redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-03-23 13:47:25 +01:00
Steffen Persvold	b7157acf42	x86/apic: Add separate apic_id_valid() functions for selected apic drivers As suggested by Suresh Siddha and Yinghai Lu: For x2apic pre-enabled systems, apic driver is set already early through early_acpi_boot_init()/early_acpi_process_madt()/ acpi_parse_madt()/default_acpi_madt_oem_check() path so that apic_id_valid() checking will be sufficient during MADT and SRAT parsing. For non-x2apic pre-enabled systems, all apic ids should be less than 255. This allows us to substitute the checks in arch/x86/kernel/acpi/boot.c::acpi_parse_x2apic() and arch/x86/mm/srat.c::acpi_numa_x2apic_affinity_init() with apic->apic_id_valid(). In addition we can avoid feigning the x2apic cpu feature in the NumaChip apic code. The following apic drivers have separate apic_id_valid() functions which will accept x2apic type IDs : x2apic_phys x2apic_cluster x2apic_uv_x apic_numachip Signed-off-by: Steffen Persvold <sp@numascale.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Daniel J Blueman <daniel@numascale-asia.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Jack Steiner <steiner@sgi.com> Link: http://lkml.kernel.org/r/1331925935-13372-1-git-send-email-sp@numascale.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-03-23 13:28:43 +01:00
Yinghai Lu	0b8b8078cb	x86: Fix excessive MSR print out when show_msr is not specified Dave found: \| During bootup, I now have 162 messages like this.. \| [ 0.227346] MSR0000001b: 00000000fee00900 \| [ 0.227465] MSR00000021: 0000000000000001 \| [ 0.227584] MSR0000002a: 00000000c1c81400 \| \| commit `21c3fcf3e3` looks suspect. \| It claims that it will only print these out if show_msr= is \| passed, but that doesn't seem to be the case. Fix it by changing to the version that checks the index. Reported-and-tested-by: Dave Jones <davej@redhat.com> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1332477103-4595-1-git-send-email-yinghai@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-03-23 09:55:00 +01:00
Peter Zijlstra	c7206205d0	perf: Fix mmap_page capabilities and docs Complete the syscall-less self-profiling feature and address all complaints, namely: - capabilities, so we can detect what is actually available at runtime Add a capabilities field to perf_event_mmap_page to indicate what is actually available for use. - on x86: RDPMC weirdness due to being 40/48 bits and not sign-extending properly. - ABI documentation as to how all this stuff works. Also improve the documentation for the new features. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Cc: Vince Weaver <vweaver1@eecs.utk.edu> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1332433596.2487.33.camel@twins Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-03-23 09:52:16 +01:00
Ingo Molnar	c5bc437702	Cleanups and fixes for perf/core: . Short term fix for 'diff' tool breakage related to perf.data files with multiple events. From Jiri Olsa . Cleanup for event id tracepoint reading routine, from Borislav Petkov . 32-bit compilation fixes from Jiri Olsa . Event parsing modifier assignment fixes from Jiri Olsa Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.14 (GNU/Linux) iQIcBAABAgAGBQJPa24hAAoJENZQFvNTUqpARrEQAIXfsXG2X45twEFHJ0oOmvZI 7F28pMFM0OBJdSq/MKyO86+j5wtbtKtjCcVZfVm1ukkeD7KnKQv+iyBw1rEuE/dx IJIIS+zkCEAje7s64diNax/rpvZXp+k92pKKo/JGjalsVcw63VF5n0Ej3JA+n8Qp anKyJd2SJFogoljTzRnFnZ+zsbn5CWVwb3NVRhH8t+jfKKOsmw8Nq5ds5ysuRf8x +Xpm3YAsyKzAJMd05Uo5no+Ne4A5TukySiarZl3wiQ0TrwM9mVxQVAU5iVkAUZkb Bp1dgBOgG1RNudcRSgbq3NuO7TGFby6DqODgXvo1TzpLyixKbWUz3JI0mEya5xCL eBxJ2UTQpruBg+XJ3C0ys74TkMcQUeyxhqbwPu1WhMbP7cIJK1W1hJ728qWRG94a uxFMFlEfc0d9kxlsXVvxUumiyhEw5gPtwdbQf7fo2xpMPQmHCC3yS7DdMO6FBfUW oqPC7gb3aSTmHnKuYvwbT9EcOL27YHU+Y/4sRJ/QfohArgc4h60dUQrE9fWz7dXH /eCXDvMnBs2+z52dNLq/mkoQMP4iFzWqboDNn5GM4jI2LmXd/hxBGL6lb2Vzm8CI QYi5xC4E6VStSkJG2DTtvaYTclWOjo+ZrsGz7sDXQV3MgZ1wiZIWcGiPjeFBWU25 mHWkZT3of/MAbewmeklc =V7d9 -----END PGP SIGNATURE----- Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Cleanups and fixes for perf/core: . Short term fix for 'diff' tool breakage related to perf.data files with multiple events. From Jiri Olsa . Cleanup for event id tracepoint reading routine, from Borislav Petkov . 32-bit compilation fixes from Jiri Olsa . Event parsing modifier assignment fixes from Jiri Olsa Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-03-23 09:16:03 +01:00
Dmitry Adamushko	29a2e2836f	x86-32: Fix endless loop when processing signals for kernel tasks The problem occurs on !CONFIG_VM86 kernels [1] when a kernel-mode task returns from a system call with a pending signal. A real-life scenario is a child of 'khelper' returning from a failed kernel_execve() in ____call_usermodehelper() [ kernel/kmod.c ]. kernel_execve() fails due to a pending SIGKILL, which is the result of "kill -9 -1" (at least, busybox's init does it upon reboot). The loop is as follows: * syscall_exit_work: - work_pending: // start_of_the_loop - work_notify_sig: - do_notify_resume() - do_signal() - if (!user_mode(regs)) return; - resume_userspace // TIF_SIGPENDING is still set - work_pending // so we call work_pending => goto // start_of_the_loop More information can be found in another LKML thread: http://www.serverphorums.com/read.php?12,457826 [1] the problem was also seen on MIPS. Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com> Link: http://lkml.kernel.org/r/1332448765.2299.68.camel@dimm Cc: Oleg Nesterov <oleg@redhat.com> Cc: Roland McGrath <roland@hack.frob.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: <stable@vger.kernel.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2012-03-22 13:50:25 -07:00
Jan Kiszka	639077fb69	kgdb: x86: Return all segment registers also in 64-bit mode Even if the content is always 0, gdb expects us to return also ds, es, fs, and gs while in x86-64 mode. Do this to avoid ugly errors on "info registers". [jason.wessel@windriver.com: adjust NUMREGBYTES for two new regs] Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Jason Wessel <jason.wessel@windriver.com>	2012-03-22 15:07:15 -05:00
Linus Torvalds	28f23d1f3b	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 "urgent" leftovers from Ingo Molnar: "Pending x86/urgent bits that were not high prio enough to warrant -rc-less v3.3-final inclusion." * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, efi: Fix pointer math issue in handle_ramdisks() x86/ioapic: Add register level checks to detect bogus io-apic entries x86, mce: Fix rcu splat in drain_mce_log_buffer() x86, memblock: Move mem_hole_size() to .init	2012-03-22 09:44:50 -07:00
Linus Torvalds	2390481546	Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 platform changes from Ingo Molnar. Removes the Moorestown platform that nobody ever used. * 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/platform: Move APIC ID validity check into platform APIC code x86/olpc/xo15/sci: Enable lid close wakeup control x86/geode/net5501: Add platform driver for Soekris Engineering net5501 x86/geode/alix2: Supplement driver to include GPIO button support x86/mid/powerbtn: Use MSIC read/write instead of ipc_scu x86/mid/thermal: Turn off thermistor x86/mid/thermal: Add msic_thermal alias x86/mid/thermal: Convert to use Intel MSIC API x86/mid/scu_ipc: Remove Moorestown support x86/mid: Kill off Moorestown x86/mrst: Add msic_thermal platform support x86/config: Select MSIC MFD driver on Intel Medfield platform x86/mid: Remove Intel Moorestown x86/mrst: Set ISA bus type for fake MP IRQs x86/ioapic: Use legacy_pic to set correct gsi-irq mapping	2012-03-22 09:43:22 -07:00
Linus Torvalds	754b980077	Merge branch 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull MCE changes from Ingo Molnar. * 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mce: Fix return value of mce_chrdev_read() when erst is disabled x86/mce: Convert static array of pointers to per-cpu variables x86/mce: Replace hard coded hex constants with symbolic defines x86/mce: Recognise machine check bank signature for data path error x86/mce: Handle "action required" errors x86/mce: Add mechanism to safely save information in MCE handler x86/mce: Create helper function to save addr/misc when needed HWPOISON: Add code to handle "action required" errors. HWPOISON: Clean up memory_failure() vs. __memory_failure()	2012-03-22 09:42:04 -07:00
Linus Torvalds	35cb8d9e18	Merge branch 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/fpu changes from Ingo Molnar. * 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: i387: Split up <asm/i387.h> into exported and internal interfaces i387: Uninline the generic FP helpers that we expose to kernel modules	2012-03-22 09:41:22 -07:00
Linus Torvalds	f06fc0c0de	Merge branch 'x86-eficross-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/eficross (booting 32/64-bit kernel from 64/32-bit EFI) from Ingo Molnar * 'x86-eficross-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, efi: Allow basic init with mixed 32/64-bit efi/kernel x86, efi: Add basic error handling x86, efi: Cleanup config table walking x86, efi: Convert printk to pr_*() x86, efi: Refactor efi_init() a bit	2012-03-22 09:31:31 -07:00
Linus Torvalds	4c64616bb5	Merge branch 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/debug changes from Ingo Molnar. * 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: Fix section warnings x86-64: Fix CFI data for common_interrupt() x86: Properly _init-annotate NMI selftest code x86/debug: Fix/improve the show_msr=<cpus> debug print out	2012-03-22 09:30:39 -07:00
Linus Torvalds	c5c7fb8fbd	Merge branches 'x86-cpu-for-linus', 'x86-boot-for-linus', 'x86-cpufeature-for-linus', 'x86-process-for-linus' and 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull trivial x86 branches from Ingo Molnar: small one-liners to fix up details. * 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: Remove some noise from boot log when starting cpus * 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, boot: Fix port argument to inl() function * 'x86-cpufeature-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, cpufeature: Add CPU features from Intel document 319433-012A * 'x86-process-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86_64: Record stack pointer before task execution begins * 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/UV: Lower UV rtc clocksource rating	2012-03-22 09:28:15 -07:00
Linus Torvalds	e17fdf5c67	Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/asm changes from Ingo Molnar * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: Include probe_roms.h in probe_roms.c x86/32: Print control and debug registers for kerenel context x86: Tighten dependencies of CPU_SUP_*_32 x86/numa: Improve internode cache alignment x86: Fix the NMI nesting comments x86-64: Improve insn scheduling in SAVE_ARGS_IRQ x86-64: Fix CFI annotations for NMI nesting code bitops: Add missing parentheses to new get_order macro bitops: Optimise get_order() bitops: Adjust the comment on get_order() to describe the size==0 case x86/spinlocks: Eliminate TICKET_MASK x86-64: Handle byte-wise tail copying in memcpy() without a loop x86-64: Fix memcpy() to support sizes of 4Gb and above x86-64: Fix memset() to support sizes of 4Gb and above x86-64: Slightly shorten copy_page()	2012-03-22 09:13:24 -07:00
Len Brown	e840dfe334	Merge branch 'stable/for-x86-for-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen into tboot	2012-03-22 01:31:09 -04:00
Xiao Guangrong	b716ad953a	mm: search from free_area_cache for the bigger size If the required size is bigger than cached_hole_size it is better to search from free_area_cache - it is easier to get a free region, specifically for the 64 bit process whose address space is large enough Do it just as hugetlb_get_unmapped_area_topdown() in arch/x86/mm/hugetlbpage.c Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Hillf Danton <dhillf@gmail.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-03-21 17:54:56 -07:00
Andrea Arcangeli	1a5a9906d4	mm: thp: fix pmd_bad() triggering in code paths holding mmap_sem read mode In some cases it may happen that pmd_none_or_clear_bad() is called with the mmap_sem hold in read mode. In those cases the huge page faults can allocate hugepmds under pmd_none_or_clear_bad() and that can trigger a false positive from pmd_bad() that will not like to see a pmd materializing as trans huge. It's not khugepaged causing the problem, khugepaged holds the mmap_sem in write mode (and all those sites must hold the mmap_sem in read mode to prevent pagetables to go away from under them, during code review it seems vm86 mode on 32bit kernels requires that too unless it's restricted to 1 thread per process or UP builds). The race is only with the huge pagefaults that can convert a pmd_none() into a pmd_trans_huge(). Effectively all these pmd_none_or_clear_bad() sites running with mmap_sem in read mode are somewhat speculative with the page faults, and the result is always undefined when they run simultaneously. This is probably why it wasn't common to run into this. For example if the madvise(MADV_DONTNEED) runs zap_page_range() shortly before the page fault, the hugepage will not be zapped, if the page fault runs first it will be zapped. Altering pmd_bad() not to error out if it finds hugepmds won't be enough to fix this, because zap_pmd_range would then proceed to call zap_pte_range (which would be incorrect if the pmd become a pmd_trans_huge()). The simplest way to fix this is to read the pmd in the local stack (regardless of what we read, no need of actual CPU barriers, only compiler barrier needed), and be sure it is not changing under the code that computes its value. Even if the real pmd is changing under the value we hold on the stack, we don't care. If we actually end up in zap_pte_range it means the pmd was not none already and it was not huge, and it can't become huge from under us (khugepaged locking explained above). All we need is to enforce that there is no way anymore that in a code path like below, pmd_trans_huge can be false, but pmd_none_or_clear_bad can run into a hugepmd. The overhead of a barrier() is just a compiler tweak and should not be measurable (I only added it for THP builds). I don't exclude different compiler versions may have prevented the race too by caching the value of pmd on the stack (that hasn't been verified, but it wouldn't be impossible considering pmd_none_or_clear_bad, pmd_bad, pmd_trans_huge, pmd_none are all inlines and there's no external function called in between pmd_trans_huge and pmd_none_or_clear_bad). if (pmd_trans_huge(pmd)) { if (next-addr != HPAGE_PMD_SIZE) { VM_BUG_ON(!rwsem_is_locked(&tlb->mm->mmap_sem)); split_huge_page_pmd(vma->vm_mm, pmd); } else if (zap_huge_pmd(tlb, vma, pmd, addr)) continue; /* fall through / } if (pmd_none_or_clear_bad(pmd)) Because this race condition could be exercised without special privileges this was reported in CVE-2012-1179. The race was identified and fully explained by Ulrich who debugged it. I'm quoting his accurate explanation below, for reference. ====== start quote ======= mapcount 0 page_mapcount 1 kernel BUG at mm/huge_memory.c:1384! At some point prior to the panic, a "bad pmd ..." message similar to the following is logged on the console: mm/memory.c:145: bad pmd ffff8800376e1f98(80000000314000e7). The "bad pmd ..." message is logged by pmd_clear_bad() before it clears the page's PMD table entry. 143 void pmd_clear_bad(pmd_t pmd) 144 { -> 145 pmd_ERROR(pmd); 146 pmd_clear(pmd); 147 } After the PMD table entry has been cleared, there is an inconsistency between the actual number of PMD table entries that are mapping the page and the page's map count (_mapcount field in struct page). When the page is subsequently reclaimed, __split_huge_page() detects this inconsistency. 1381 if (mapcount != page_mapcount(page)) 1382 printk(KERN_ERR "mapcount %d page_mapcount %d\n", 1383 mapcount, page_mapcount(page)); -> 1384 BUG_ON(mapcount != page_mapcount(page)); The root cause of the problem is a race of two threads in a multithreaded process. Thread B incurs a page fault on a virtual address that has never been accessed (PMD entry is zero) while Thread A is executing an madvise() system call on a virtual address within the same 2 MB (huge page) range. virtual address space .---------------------. \| \| \| \| .-\|---------------------\| \| \| \| \| \| \|<-- B(fault) \| \| \| 2 MB \| \|/////////////////////\|-. huge < \|/////////////////////\| > A(range) page \| \|/////////////////////\|-' \| \| \| \| \| \| '-\|---------------------\| \| \| \| \| '---------------------' - Thread A is executing an madvise(..., MADV_DONTNEED) system call on the virtual address range "A(range)" shown in the picture. sys_madvise // Acquire the semaphore in shared mode. down_read(&current->mm->mmap_sem) ... madvise_vma switch (behavior) case MADV_DONTNEED: madvise_dontneed zap_page_range unmap_vmas unmap_page_range zap_pud_range zap_pmd_range // // Assume that this huge page has never been accessed. // I.e. content of the PMD entry is zero (not mapped). // if (pmd_trans_huge(pmd)) { // We don't get here due to the above assumption. } // // Assume that Thread B incurred a page fault and .---------> // sneaks in here as shown below. \| // \| if (pmd_none_or_clear_bad(pmd)) \| { \| if (unlikely(pmd_bad(pmd))) \| pmd_clear_bad \| { \| pmd_ERROR \| // Log "bad pmd ..." message here. \| pmd_clear \| // Clear the page's PMD entry. \| // Thread B incremented the map count \| // in page_add_new_anon_rmap(), but \| // now the page is no longer mapped \| // by a PMD entry (-> inconsistency). \| } \| } \| v - Thread B is handling a page fault on virtual address "B(fault)" shown in the picture. ... do_page_fault __do_page_fault // Acquire the semaphore in shared mode. down_read_trylock(&mm->mmap_sem) ... handle_mm_fault if (pmd_none(pmd) && transparent_hugepage_enabled(vma)) // We get here due to the above assumption (PMD entry is zero). do_huge_pmd_anonymous_page alloc_hugepage_vma // Allocate a new transparent huge page here. ... __do_huge_pmd_anonymous_page ... spin_lock(&mm->page_table_lock) ... page_add_new_anon_rmap // Here we increment the page's map count (starts at -1). atomic_set(&page->_mapcount, 0) set_pmd_at // Here we set the page's PMD entry which will be cleared // when Thread A calls pmd_clear_bad(). ... spin_unlock(&mm->page_table_lock) The mmap_sem does not prevent the race because both threads are acquiring it in shared mode (down_read). Thread B holds the page_table_lock while the page's map count and PMD table entry are updated. However, Thread A does not synchronize on that lock. ====== end quote ======= [akpm@linux-foundation.org: checkpatch fixes] Reported-by: Ulrich Obergfell <uobergfe@redhat.com> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Hugh Dickins <hughd@google.com> Cc: Dave Jones <davej@redhat.com> Acked-by: Larry Woodman <lwoodman@redhat.com> Acked-by: Rik van Riel <riel@redhat.com> Cc: <stable@vger.kernel.org> [2.6.38+] Cc: Mark Salter <msalter@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-03-21 17:54:54 -07:00
Linus Torvalds	c207f3a431	Generialize powerpc's irq_host as irq_domain This branch takes the PowerPC irq_host infrastructure (reverse mapping from Linux IRQ numbers to hardware irq numbering), generalizes it, renames it to irq_domain, and makes it available to all architectures. Originally the plan has been to create an all-new irq_domain implementation which addresses some of the powerpc shortcomings such as not handling 1:1 mappings well, but doing that proved to be far more difficult and invasive than generalizing the working code and refactoring it in-place. So, this branch rips out the 'new' irq_domain and replaces it with the modified powerpc version (in a fully bisectable way of course). It converts all users over to the new API and makes irq_domain selectable on any architecture. No architecture is forced to enable irq_domain, but the infrastructure is required for doing OpenFirmware style irq translations. It will even work on SPARC even though SPARC has it's own mechanism for translating irqs at boot time. MIPS, microblaze, embedded x86 and c6x are converted too. The resulting irq_domain code is probably still too verbose and can be optimized more, but that can be done incrementally and is a task for follow-on patches. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJPZ1yiAAoJEEFnBt12D9kB4yIQAJvCfTPL65sCYVD6i9RnVHtR ahwddtd0AtT+UYLU8Xg2fZgVi6cmupDGnqkBixzZD3xxSTERqm7Snqa0ugklfeAi B6Zqf/K17H5hJNaoQ3fkNauow8m7ZYOeEH2vVUvkb3woWS9Wm7OGd+BvcIBgYSGe Aaoumhu7kDxFkii0qz3x/+kvsb6DRp2HtSPWj+APL/kNjdiO4JBOihtcc/lX6d47 bsZLiEMzHUFV4ApJNwqmfDnf54oMrHmrRJxgQHIMjeJC5or9I3Do8wDGe/aTF5xO 5GVpxCQsTlJMjTBWlAFtpTwCJB6y76EHQrHc7WzLlq8OJSsxApOke8M0BzXFrfMy CU7UUpTvNZTLpZibLCEQKemv1+oNOkfFylsHxfek2MCqx0W6W4FHEGV3qE/GtgV9 +vurA9hNNp7VM0FGRGigcUr3woYdHLdEVQrlnL7Z9AgBu1W44MZLaai7iRVZOeCT ZQ9++v2PJJ8vHT8kdkgTdiRpnEhmv84MX/GBT7ilWFEMIVeT5zhGkIBojzNgyzGc 7cvermmM0P8h+unkDgmzmSbDxo0PboqVKeoO71AOBhA6MmR9iom7XkuNdHhoOwy2 4A5xT1srbhJDbuv15BBREBV24TywpZ4a1+4nwQT4L1fXe+HfCxeEWexGcKQMRcIt dAelOHTQ+ZGkOKvXeW05 =ruGA -----END PGP SIGNATURE----- Merge tag 'irqdomain-for-linus' of git://git.secretlab.ca/git/linux-2.6 Pull irq_domain support for all architectures from Grant Likely: "Generialize powerpc's irq_host as irq_domain This branch takes the PowerPC irq_host infrastructure (reverse mapping from Linux IRQ numbers to hardware irq numbering), generalizes it, renames it to irq_domain, and makes it available to all architectures. Originally the plan has been to create an all-new irq_domain implementation which addresses some of the powerpc shortcomings such as not handling 1:1 mappings well, but doing that proved to be far more difficult and invasive than generalizing the working code and refactoring it in-place. So, this branch rips out the 'new' irq_domain and replaces it with the modified powerpc version (in a fully bisectable way of course). It converts all users over to the new API and makes irq_domain selectable on any architecture. No architecture is forced to enable irq_domain, but the infrastructure is required for doing OpenFirmware style irq translations. It will even work on SPARC even though SPARC has it's own mechanism for translating irqs at boot time. MIPS, microblaze, embedded x86 and c6x are converted too. The resulting irq_domain code is probably still too verbose and can be optimized more, but that can be done incrementally and is a task for follow-on patches." * tag 'irqdomain-for-linus' of git://git.secretlab.ca/git/linux-2.6: (31 commits) dt: fix twl4030 for non-dt compile on x86 mfd: twl-core: Add IRQ_DOMAIN dependency devicetree: Add empty of_platform_populate() for !CONFIG_OF_ADDRESS (sparc) irq_domain: Centralize definition of irq_dispose_mapping() irq_domain/mips: Allow irq_domain on MIPS irq_domain/x86: Convert x86 (embedded) to use common irq_domain ppc-6xx: fix build failure in flipper-pic.c and hlwd-pic.c irq_domain/microblaze: Convert microblaze to use irq_domains irq_domain/powerpc: Replace custom xlate functions with library functions irq_domain/powerpc: constify irq_domain_ops irq_domain/c6x: Use library of xlate functions irq_domain/c6x: constify irq_domain structures irq_domain/c6x: Convert c6x to use generic irq_domain support. irq_domain: constify irq_domain_ops irq_domain: Create common xlate functions that device drivers can use irq_domain: Remove irq_domain_add_simple() irq_domain: Remove 'new' irq_domain in favour of the ppc one mfd: twl-core.c: Fix the number of interrupts managed by twl4030 of/address: add empty static inlines for !CONFIG_OF irq_domain: Add support for base irq and hwirq in legacy mappings ...	2012-03-21 10:27:19 -07:00
Linus Torvalds	c7c66c0cb0	Power management updates for 3.4 Assorted extensions and fixes including: * Introduction of early/late suspend/hibernation device callbacks. * Generic PM domains extensions and fixes. * devfreq updates from Axel Lin and MyungJoo Ham. * Device PM QoS updates. * Fixes of concurrency problems with wakeup sources. * System suspend and hibernation fixes. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.18 (GNU/Linux) iQIcBAABAgAGBQJPZww5AAoJEKhOf7ml8uNsiBYQAL9YGso7KypZhLspNxvAKuZr iHyme2F7OdOiUfo40DVH5tRuEsQvLOl0S+9ukWLrzQotKBsMfym05jtbGN9m6Ygh Z793sx3eRI3mltekJ9yrOxH6BOBDMWMkwY8ztU/X5aYDNirgJ/qtAjSK4BvWXBrz APeaUReVnLdaNP8SnhHfne/KPsHk++NKZvAAva7E6RwtZn4KV6bfiBPGb8yvY8pP m4cg1S5QEduMy+zQJ8+IlEHR91bt9spUyRwbhw6ZHCNzNeu4iEZT8DVt1O1sIRbO LsNcClqsd40nr781SoF8N9GmGUxlUDr46bS3FSsDkYzn8uyxGEsv00edJZtPwIm5 7nPuYat3Ke1YsON0Kcd/wkBGXqw/Rjfp3F1bnHjpVx/0oM/6MPrFNnIwvpHspejG kN3770idYJ17dLckhcsbYsLdy8yirITILDzvHT0AAaZ9z4Lr9Pm56WwFZLyb/lhR 2cqK8Bb8W9YvcVsKV8YqkyBVrygWMe+c56KoAoUBiSNxvW6LphmXFBj5QiFMs8s8 Xh8H7xU96FKbpNMIAZ1+bpI4zgulQG4xPXI9pKbhMfjaMUgj2zQeO8/t0WlB1M0z +kEUcYHJnXrRrObQuHEFXZdIjy/E0fdUboMIrlLt0gm97OxnG6imPseQp6/leQkC t+L4Aq6TOUofUU86d4cI =IGhc -----END PGP SIGNATURE----- Merge tag 'pm-for-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates for 3.4 from Rafael Wysocki: "Assorted extensions and fixes including: * Introduction of early/late suspend/hibernation device callbacks. * Generic PM domains extensions and fixes. * devfreq updates from Axel Lin and MyungJoo Ham. * Device PM QoS updates. * Fixes of concurrency problems with wakeup sources. * System suspend and hibernation fixes." * tag 'pm-for-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (43 commits) PM / Domains: Check domain status during hibernation restore of devices PM / devfreq: add relation of recommended frequency. PM / shmobile: Make MTU2 driver use pm_genpd_dev_always_on() PM / shmobile: Make CMT driver use pm_genpd_dev_always_on() PM / shmobile: Make TMU driver use pm_genpd_dev_always_on() PM / Domains: Introduce "always on" device flag PM / Domains: Fix hibernation restore of devices, v2 PM / Domains: Fix handling of wakeup devices during system resume sh_mmcif / PM: Use PM QoS latency constraint tmio_mmc / PM: Use PM QoS latency constraint PM / QoS: Make it possible to expose PM QoS latency constraints PM / Sleep: JBD and JBD2 missing set_freezable() PM / Domains: Fix include for PM_GENERIC_DOMAINS=n case PM / Freezer: Remove references to TIF_FREEZE in comments PM / Sleep: Add more wakeup source initialization routines PM / Hibernate: Enable usermodehelpers in hibernate() error path PM / Sleep: Make __pm_stay_awake() delete wakeup source timers PM / Sleep: Fix race conditions related to wakeup source timer function PM / Sleep: Fix possible infinite loop during wakeup source destruction PM / Hibernate: print physical addresses consistently with other parts of kernel ...	2012-03-21 10:15:51 -07:00

... 3 4 5 6 7 ...

8573 commits