q3k/linux

Author	SHA1	Message	Date
Paolo Ciarrocchi	209b580fd8	x86: coding style fixes to arch/x86/lib/strstr_32.c Before: total: 3 errors, 0 warnings, 31 lines checked After: total: 0 errors, 0 warnings, 31 lines checked paolo@paolo-desktop:~/linux.trees.git$ md5sum /tmp/strstr_32.o.* c96006ec3387862e5bacb139207a3098 /tmp/strstr_32.o.after c96006ec3387862e5bacb139207a3098 /tmp/strstr_32.o.before Signed-off-by: Paolo Ciarrocchi <paolo.ciarrocchi@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-15 16:53:24 +02:00
Paolo Ciarrocchi	2070dae10f	x86: coding style fixes to arch/x86/kernel/bios_uv.c paolo@paolo-desktop:~/linux.trees.git$ md5sum /tmp/bios_uv.o.* 9afe794594831166704744184e192ed8 /tmp/bios_uv.o.after 9afe794594831166704744184e192ed8 /tmp/bios_uv.o.before Signed-off-by: Paolo Ciarrocchi <paolo.ciarrocchi@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-15 16:53:24 +02:00
Paolo Ciarrocchi	020878ac42	x86: coding style fixes to arch/x86/boot/compressed/misc.c Before: total: 4 errors, 6 warnings, 439 lines checked After: total: 1 errors, 5 warnings, 441 lines checked Before -#include <asm/io.h> +#include <linux/io.h> paolo@paolo-desktop:~/linux.trees.git$ md5sum /tmp/misc.o.* 8b2394e1fe519a9542e9a7e3e7b69c39 /tmp/misc.o.after 8b2394e1fe519a9542e9a7e3e7b69c39 /tmp/misc.o.before After -#include <asm/io.h> +#include <linux/io.h> paolo@paolo-desktop:~/linux.trees.git$ md5sum /tmp/misc.o.* 59a2d264284be5e72b5af4f3a8ccfb47 /tmp/misc.o.after 8b2394e1fe519a9542e9a7e3e7b69c39 /tmp/misc.o.before Signed-off-by: Paolo Ciarrocchi <paolo.ciarrocchi@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-15 16:53:23 +02:00
Marcin Slusarz	9744f5a328	x86, acpi: cleanup, temp_stack is used only when CONFIG_SMP is set fix: arch/x86/kernel/acpi/sleep.c:24: warning: 'temp_stack' defined but not used [ Sven Wegener <sven.wegener@stealer.net>: fix build bug ] Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Acked-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-15 16:41:21 +02:00
Li Zefan	2bd455dbfe	x86: remove nesting CONFIG_HOTPLUG_CPU prefill_possible_map() is defined inside CONFIG_HOTPLUG_CPU, so the nesting CONFIG_HOTPLUG_CPU is just redundant. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-15 16:33:28 +02:00
Ingo Molnar	1a10390708	Merge branch 'linus' into x86/cpu	2008-08-15 16:16:15 +02:00
Ingo Molnar	8bb851900f	x86, nmi: clean UP NMI watchdog failure message clean up the failure message - and redirect people to bugzilla instead of lkml. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-15 15:35:31 +02:00
Aristeu Rozanski	1563666844	x86, NMI: fix watchdog failure message > it just won't work at boot time - the second logic unit will be stuck: > > Booting processor 1/2 APIC 0x1 > Initializing CPU#1 > Calibrating delay using timer specific routine.. 5586.12 BogoMIPS (lpj=2793063) > CPU: Trace cache: 12K uops, L1 D cache: 16K > CPU: L2 cache: 1024K > CPU: Physical Processor ID: 0 > CPU: Processor Core ID: 1 > CPU1: Thermal monitoring enabled (TM1) > Intel(R) Pentium(R) D CPU 2.80GHz stepping 04 > Brought up 2 CPUs > testing NMI watchdog ... <4>WARNING: CPU#1: NMI appears to be stuck (0->0)! while at it... - fix that newline Signed-off-by: Aristeu Rozanski <aris@redhat.com> Cc: jvillalo@redhat.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-15 15:34:59 +02:00
Hugh Dickins	a06de63000	x86: fix /proc/meminfo DirectMap Do we actually want these DirectMap lines in the x86 /proc/meminfo? I can see they're interesting to CPA developers and TLB optimizers, but they don't fit its usual "where has all my memory gone?" usage. If they are to stay, here are some fixes. 1. On x86_32 without PAE, they're not 2M but 4M pages: no need to mess with the internal enum, but show the right name to users. 2. Many machines can never show anything but 0 for DirectMap1G, so suppress that line unless direct_gbpages are really enabled. 3. The unit in /proc/meminfo is kB not number of pages: HugePages messed that up, but they're an example to regret not to follow. 4. Once we use kB, it's easy to see that 1GB has gone missing (which explains why CONFIG_CPA_DEBUG=y soon wraps DirectMap2M negative): because head_64.S's level2_ident_pgt entries were not counted. My fix is not ideal, but works for more and for less than 1G, and avoids interfering with early bootup pagetable contortions. Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: Andi Kleen <andi@firstfloor.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-15 15:27:55 +02:00
Pavel Machek	04b69447f7	arch/x86/Kconfig: clean up, experimental adjustment Adjust experimental tags in Kconfig, update config to note that i386/x86_64 is now a single architecture. Signed-off-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-15 14:06:54 +02:00
Mark Langsdorf	394a15051c	x86: invalidate caches before going into suspend When a CPU core is shut down, all of its caches need to be flushed to prevent stale data from causing errors if the core is resumed. Current Linux suspend code performs an assignment after the flush, which can add dirty data back to the cache. On some AMD platforms, additional speculative reads have caused crashes on resume because of this dirty data. Relocate the cache flush to be the very last thing done before halting. Tie into an assembly line so the compile will not reorder it. Add some documentation explaining what is going on and why we're doing this. Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com> Acked-by: Mark Borden <mark.borden@amd.com> Acked-by: Michael Hohmuth <michael.hohmuth@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-15 14:04:30 +02:00
Aristeu Rozanski	dcc9841668	x86, perfctr: don't use CCCR_OVF_PMI1 on Pentium 4Ds Currently, setup_p4_watchdog() uses CCCR_OVF_PMI1 to enable the counter overflow interrupts to the second logical core. But this bit doesn't work on Pentium 4 Ds (model 4, stepping 4) and this patch avoids its use on these processors. Tested on 4 different machines that have this specific model with success. Signed-off-by: Aristeu Rozanski <aris@redhat.com> Cc: jvillalovos@redhat.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-15 13:58:33 +02:00
Ingo Molnar	975439fe73	Merge branch 'x86/amd-iommu' into x86/urgent	2008-08-15 13:57:32 +02:00
Joerg Roedel	129d6aba44	x86, AMD IOMMU: initialize dma_ops after sysfs registration If sysfs registration fails all memory used by IOMMU is freed. This happens after dma_ops initialization and the functions will access the freed memory then. Fix this by initializing dma_ops after the sysfs registration. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-15 13:56:56 +02:00
Joerg Roedel	8a456695c5	x86, AMD IOMMU: cleanup: replace LOW_U32 macro with generic lower_32_bits Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-15 13:56:56 +02:00
Joerg Roedel	9f5f5fb35d	x86, AMD IOMMU: initialize device table properly This patch adds device table initializations which forbids memory accesses for devices per default and disables all page faults. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-15 13:56:54 +02:00
Joerg Roedel	519c31bacf	x86, AMD IOMMU: use status bit instead of memory write-back for completion wait Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-15 13:56:46 +02:00
Dave Jones	ef31023743	x86: silence mmconfig printk There's so much broken mmconfig hardware/bios'es out there, that classing this as an error seems a little extreme. Lower its priority to KERN_INFO so that it isn't so noisy when booting with 'quiet' Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-15 13:52:39 +02:00
Darrick J. Wong	967060d00d	x86, msr: fix NULL pointer deref due to msr_open on nonexistent CPUs msr_open tests for someone trying to open a device for a nonexistent CPU. However, the function always returns 0, not ret like it should, hence userspace can BUG the kernel trivially. This bug was introduced by the cdev lock_kernel pushdown patch last May. The BUG can be reproduced with these commands: # mknod fubar c 202 8 <-- pick a number less than NR_CPUS that is not the number of an online CPU # cat fubar Signed-off-by: Darrick J. Wong <djwong@us.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-15 13:38:30 +02:00
Thomas Gleixner	a6825f1c1f	x86: hpet: workaround SB700 BIOS AMD SB700 based systems with spread spectrum enabled use a SMM based HPET emulation to provide proper frequency setting. The SMM code is initialized with the first HPET register access and takes some time to complete. During this time the config register reads 0xffffffff. We check for max. 1000 loops whether the config register reads a non 0xffffffff value to make sure that HPET is up and running before we go further. A counting loop is safe, as the HPET access takes thousands of CPU cycles. On non SB700 based machines this check is only done once and has no side effects. Based on a quirk patch from: crane cai <crane.cai@amd.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-08-14 13:23:45 +02:00
Ingo Molnar	8d7ccaa545	Merge commit 'v2.6.27-rc3' into x86/prototypes Conflicts: include/asm-x86/dma-mapping.h Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-14 12:19:59 +02:00
Yinghai Lu	a58f03b075	x86: check bigsmp in smp_sanity_check instead of cpu_up clear bits for cpu nr > 8. This allows us to boot the full range of possible CPUs that the supported APIC model will allow. Previously we'd hang or boot up with less than 8 CPUs. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Tested-by: Jeff Chua <jeff.chua.linux@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-14 11:35:53 +02:00
Yinghai Lu	858f774733	x86: don't call e820_regiter_active_regions if out of range on node so we don't get warning on 32bit system with 64g RAM or more Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-14 11:35:52 +02:00
Max Krasnyansky	23b49c19f6	x86: resurrect proper handling of maxcpus= kernel option (v2) For some reason we had two parsers registered for maxcpus=. One in init/main.c and another in arch/x86/smpboot.c. So I nuked the one in arch/x86. Also 64-bit kernels used to handle maxcpus= as documented in Documentation/cpu-hotplug.txt. CPUs with 'id > maxcpus' are initialized but not booted. 32-bit version for some reason ignored them even though all the infrastructure for booting them later is there. In the current mainline both 64 and 32 bit versions are broken. This patch restores the correct behaviour. I've tested x86_64 version on 4- and 8- way Core2 and 2-way Opteron based machines. Various config combinations SMP, !SMP, CPU_HOTPLUG, !CPU_HOTPLUG. Booted with maxcpus=1 and maxcpus=4, etc. Everything is working as expected. So far we've received two reports from different people confirming that 32-bit version also works fine, both on dual core laptops and 16way server machines. [v2: This version fixes visws breakage pointed out by Ingo.] Signed-off-by: Max Krasnyansky <maxk@qualcomm.com> Cc: lizf@cn.fujitsu.com Cc: jeff.chua.linux@gmail.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-14 11:18:08 +02:00
Ingo Molnar	3167761965	Merge branch 'x86/fpu' into x86/urgent	2008-08-14 11:18:08 +02:00
Suresh Siddha	f65bc214e0	x86, xsave: use BUG_ON() instead of BUILD_BUG_ON() All these structure sizes are runtime determined. So use a runtime bug check. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-14 10:56:07 +02:00
Suresh Siddha	ed40595805	x86, xsave: clear the user buffer before doing fxsave/xsave fxsave/xsave instructions will not touch all the bytes in the fxsave/xsave frame. Clear the user buffer before doing fxsave/xsave directly to user buffer during the sigcontext setup. This is essentially needed in the context of xsave(for example, some of the fields in the xsave header are not touched by the xsave and defined as must be zero). This will also present uniform and clean context to the user (from which user can safely do fxrstor/xrstor). Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-14 10:56:06 +02:00
Suresh Siddha	ee2b92a820	x86, xsave: remove the redundant access_ok() in setup_rt_frame() save_i387_xstate() is already doing the required access_ok(). Remove the redundant access_ok() before it. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-14 10:56:06 +02:00
Ingo Molnar	d4439087d3	Merge commit 'v2.6.27-rc3' into x86/xsave Conflicts: arch/x86/kernel/genapic_64.c include/asm-x86/kvm_host.h Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-14 10:55:26 +02:00
H. Peter Anvin	c2dcfde827	x86: cleanup for setup code crashes during IST probe Clean up the code for crashes during SpeedStep probing on older machines. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-14 00:13:52 +02:00
Arjan van de Ven	875e40b975	x86: use WARN() in arch/x86/mm/pageattr.c Use WARN() instead of a printk+WARN_ON() pair; this way the message becomes part of the warning section for better reporting/collection. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: akpm@linux-foundation.org Cc: arjan@linux.intel.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-13 19:05:39 +02:00
John Keller	a726c6009e	x86: allow MMCONFIG above 4GB on x86_64 SGI UV will have MMCFG base addresses that are greater than 4GB (32 bits). v2: Use CONFIG_RESOURCES_64BIT instead of CONFIG_X86_64. v3: Create a flag, that is set by platform specific code, to disable the > 4GB check. Signed-off-by: John Keller <jpk@sgi.com> Cc: jpk@sgi.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-13 17:48:13 +02:00
Marcin Slusarz	6b3560229d	x86: fix 2 section mismatch warnings - find_and_reserve_crashkernel WARNING: vmlinux.o(.text+0xcd1f): Section mismatch in reference from the function find_and_reserve_crashkernel() to the function .init.text:find_e820_area() The function find_and_reserve_crashkernel() references the function __init find_e820_area(). This is often because find_and_reserve_crashkernel lacks a __init annotation or the annotation of find_e820_area is wrong. WARNING: vmlinux.o(.text+0xcd38): Section mismatch in reference from the function find_and_reserve_crashkernel() to the function .init.text:reserve_bootmem_generic() The function find_and_reserve_crashkernel() references the function __init reserve_bootmem_generic(). This is often because find_and_reserve_crashkernel lacks a __init annotation or the annotation of reserve_bootmem_generic is wrong. find_and_reserve_crashkernel is called from __init function (reserve_crashkernel) and calls 2 __init functions (find_e820_area, reserve_bootmem_generic), so mark it __init Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-13 17:48:12 +02:00
Marcin Slusarz	c9d08f0860	x86: fix 2 section mismatch warnings - map_high() WARNING: vmlinux.o(.text+0x14cf8): Section mismatch in reference from the function map_high() to the function .init.text:init_extra_mapping_uc() The function map_high() references the function __init init_extra_mapping_uc(). This is often because map_high lacks a __init annotation or the annotation of init_extra_mapping_uc is wrong. WARNING: vmlinux.o(.text+0x14d05): Section mismatch in reference from the function map_high() to the function .init.text:init_extra_mapping_wb() The function map_high() references the function __init init_extra_mapping_wb(). This is often because map_high lacks a __init annotation or the annotation of init_extra_mapping_wb is wrong. map_high is called only from __init functions (map__high) and calls 2 __init_functions (init_extra_mapping_) Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-13 13:09:49 +02:00
Ingo Molnar	a12e61df4f	Merge commit 'v2.6.27-rc3' into x86/urgent	2008-08-13 13:08:47 +02:00
Joerg Roedel	7b27718bdb	x86: fix setup code crashes on my old 486 box yesterday I tried to reactivate my old 486 box and wanted to install a current Linux with latest kernel on it. But it turned out that the latest kernel does not boot because the machine crashes early in the setup code. After some debugging it turned out that the problem is the query_ist() function. If this interrupt with that function is called the machine simply locks up. It looks like a BIOS bug. Looking for a workaround for this problem I wrote the attached patch. It checks for the CPUID instruction and if it is not implemented it does not call the speedstep BIOS function. As far as I know speedstep should be available since some Pentium earliest. Alan Cox observed that it's available since the Pentium II, so cpuid levels 4 and 5 can be excluded altogether. H. Peter Anvin cleaned up the code some more: > Right in concept, but I dislike the implementation (duplication of the > CPU detect code we already have). Could you try this patch and see if > it works for you? which, with a small modification to fix a build error with it the resulting kernel boots on my machine. Signed-off-by: Joerg Roedel <joro@8bytes.org> Signed-off-by: "H. Peter Anvin" <hpa@zytor.com> Cc: <stable@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-13 11:59:18 +02:00
Linus Torvalds	1c89ac5501	Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus * git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: fix spinlock recursion in hvc_console stop_machine: remove unused variable modules: extend initcall_debug functionality to the module loader export virtio_rng.h lguest: use get_user_pages_fast() instead of get_user_pages() mm: Make generic weak get_user_pages_fast and EXPORT_GPL it lguest: don't set MAC address for guest unless specified	2008-08-12 08:40:19 -07:00
Rusty Russell	912985dce4	mm: Make generic weak get_user_pages_fast and EXPORT_GPL it Out of line get_user_pages_fast fallback implementation, make it a weak symbol, get rid of CONFIG_HAVE_GET_USER_PAGES_FAST. Export the symbol to modules so lguest can use it. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-08-12 17:52:53 +10:00
Linus Torvalds	7019b1b500	Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: fix 2.6.27rc1 cannot boot more than 8CPUs x86: make "apic" an early_param() on 32-bit, NULL check EFI, x86: fix function prototype x86, pci-calgary: fix function declaration x86: work around gcc 3.4.x bug x86: make "apic" an early_param() on 32-bit x86, debug: tone down arch/x86/kernel/mpparse.c debugging printk x86_64: restore the proper NR_IRQS define so larger systems work. x86: Restore proper vector locking during cpu hotplug x86: Fix broken VMI in 2.6.27-rc.. x86: fdiv bug detection fix	2008-08-11 16:44:35 -07:00
Andi Kleen	9dd1e9eb5c	x86/PCI: allow scanning of 255 PCI busses Fix an old off by one error in the legacy PCI bus check. 0xff is a valid bus. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2008-08-11 15:23:50 -07:00
Yinghai Lu	b74548e76a	x86: fix 2.6.27rc1 cannot boot more than 8CPUs Jeff Chua reported that booting a !bigsmp kernel on a 16-way box hangs silently. this is a long-standing issue, smp start AP cpu could check the apic id >=8 etc before trying to start it. achieve this by moving the def_to_bigsmp check later and skip the apicid id > 8 [ mingo@elte.hu: clean up the message that is printed. ] Reported-by: "Jeff Chua" <jeff.chua.linux@gmail.com> Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> arch/x86/kernel/setup.c | 6 ------ arch/x86/kernel/smpboot.c | 10 ++++++++++ 2 files changed, 10 insertions(+), 6 deletions(-)	2008-08-11 22:42:59 +02:00
Philipp Kohlbecher	59f09ba2b6	x86: fix comment in protected mode header Comments in arch/x86/boot/compressed/head_32.S erroneously refer to the real mode pointer as the second and the heap area as the third argument to decompress_kernel(). In fact, these have been the first and second argument, respectively, since v2.6.20. This patch corrects the comments. It introduces no code changes. Signed-off-by: Philipp Kohlbecher <xt28@gmx.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-11 21:35:30 +02:00
Rene Herman	48d97cb65e	x86: make "apic" an early_param() on 32-bit, NULL check Cyrill Gorcunov observed: > you turned it into early_param so now it's NULL injecting vulnerabled. > Could you please add checking for NULL str param? fix that. Also, change the name of 'str' into 'arg', to make it more apparent that this is an optional argument that can be NULL, not a string parameter that is empty when unset. Reported-by: Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by: Rene Herman <rene.herman@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-11 19:40:38 +02:00
Randy Dunlap	9b0094f7f2	x86, pci-calgary: fix function declaration Fix function declaration: linux-next-20080807/arch/x86/kernel/pci-calgary_64.c:1353:36: warning: non-ANSI function declaration of function 'get_tce_space_from_tar' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Acked-by: Muli Ben-Yehuda <muli@il.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-11 18:48:45 +02:00
Jeremy Fitzhardinge	cf3e505012	x86: work around gcc 3.4.x bug Simon Horman reported that gcc-3.4.x crashes when compiling pgd_prepopulate_pmd() when PREALLOCATED_PMDS == 0 and CONFIG_DEBUG_INFO is enabled. Adding an extra check for PREALLOCATED_PMDS == 0 [which is compiled out by gcc] seems to avoid the problem. Reported-by: Simon Horman <horms@verge.net.au> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-11 18:44:02 +02:00
Rene Herman	fb6bef8002	x86: make "apic" an early_param() on 32-bit On 32-bit, "apic" is a __setup() param meaning it is parsed rather late in the game. Make it an early_param() for apic_printk() use by arch/x86/kernel/mpparse.c. On 64-bit, it already is an early_param(). Signed-off-by: Rene Herman <rene.herman@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-11 18:36:04 +02:00
Rene Herman	eeb0d7d113	x86, debug: tone down arch/x86/kernel/mpparse.c debugging printk commit `11a62a0560` turns some formerly nopped debugging printks in arch/x86/kernel/mpparse.c into regular ones. The one at the top of smp_scan_config() in particular also prints on !CONFIG_SMP/CONFIG_X86_LOCAL_APIC kernels and UP machines without anything resembling MP tables which makes their lowly UP owners wonder... Turn the former Dprintk()s into apic_printk()s instead meaning that their printing is dependent on passing the apic=verbose (or =debug) command line param. On 32-bit, "apic" is a __setup() param which isn't early enough for this code and therefore needs a followup changing it into an early_param(). On 64-bit, it already is. Signed-off-by: Rene Herman <rene.herman@gmail.com> Cc: Andrew Morton <akpm@osdl.org> Cc: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-11 18:36:03 +02:00
Huang Weiyi	8aeb402263	arch/x86/kernel/cpuid.c: removed duplicated #include Removed duplicated include file <linux/smp_lock.h> in arch/x86/kernel/cpuid.c. Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-11 12:58:12 +02:00
Huang Weiyi	4c942654a4	arch/x86/kernel/acpi/boot.c: removed duplicated #include Removed duplicated include file <asm/genapic.h> in arch/x86/kernel/acpi/boot.c. Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-11 12:58:11 +02:00
Ingo Molnar	6de9c70882	Merge branch 'linus' into x86/cleanups	2008-08-11 12:57:01 +02:00
Eric W. Biederman	d388e5fdc4	x86: Restore proper vector locking during cpu hotplug Having cpu_online_map change during assign_irq_vector can result in some really nasty and weird things happening. The one that bit me last time was accessing non existent per cpu memory for non existent cpus. This locking was removed in a sloppy x86_64 and x86_32 merge patch. Guys can we please try and avoid subtly breaking x86 when we are merging files together? Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2008-08-11 10:37:34 +02:00
Marcin Slusarz	d406d21d90	x86: mpparse.c: fix section mismatch warning WARNING: vmlinux.o(.text+0x118f7): Section mismatch in reference from the function construct_ioapic_table() to the function .init.text:MP_bus_info() The function construct_ioapic_table() references the function __init MP_bus_info(). This is often because construct_ioapic_table lacks a __init annotation or the annotation of MP_bus_info is wrong. construct_ioapic_table is called only from construct_default_ISA_mptable which is __init Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2008-08-10 21:13:09 -07:00
Marcin Slusarz	bafc1dae82	x86: mmconf: fix section mismatch warning WARNING: arch/x86/kernel/built-in.o(.cpuinit.text+0x1591): Section mismatch in reference from the function init_amd() to the function .init.text:check_enable_amd_mmconf_dmi() The function __cpuinit init_amd() references a function __init check_enable_amd_mmconf_dmi(). If check_enable_amd_mmconf_dmi is only used by init_amd then annotate check_enable_amd_mmconf_dmi with a matching annotation. check_enable_amd_mmconf_dmi is only called from init_amd which is __cpuinit Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2008-08-10 21:13:08 -07:00
Marcin Slusarz	85a14437ed	x86: fix MP_processor_info section mismatch warning WARNING: arch/x86/kernel/built-in.o(.cpuinit.text+0x1fe7): Section mismatch in reference from the function MP_processor_info() to the variable .init.data:x86_quirks The function __cpuinit MP_processor_info() references a variable __initdata x86_quirks. If x86_quirks is only used by MP_processor_info then annotate x86_quirks with a matching annotation. MP_processor_info uses x86_quirks which is __init and is used only from smp_read_mpc and construct_default_ISA_mptable which are __init Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2008-08-10 21:13:07 -07:00
Marcin Slusarz	90936cfe6c	x86, tsc: fix section mismatch warning WARNING: vmlinux.o(.text+0x7950): Section mismatch in reference from the function native_calibrate_tsc() to the function .init.text:tsc_read_refs() The function native_calibrate_tsc() references the function __init tsc_read_refs(). This is often because native_calibrate_tsc lacks a __init annotation or the annotation of tsc_read_refs is wrong. tsc_read_refs is called from native_calibrate_tsc which is not __init and native_calibrate_tsc cannot be marked __init Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2008-08-10 21:13:05 -07:00
Alok Kataria	31343d8a50	x86: Fix broken VMI in 2.6.27-rc.. The lowmem mapping table created by VMI need not depend on max_low_pfn at all. Instead we now create an extra large mapping which covers all possible lowmem instead of the physical ram that is actually available. This allows the vmi initialization to be done before max_low_pfn could be computed. We also move the vmi_init code very early in the boot process so that nobody accidentally breaks the fixmap dependency. Signed-off-by: Alok N Kataria <akataria@vmware.com> Acked-by: Zachary Amsden <zach@vmware.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2008-08-08 15:22:02 -07:00
Mark Langsdorf	34ae7f35a2	[CPUFREQ][2/2] preregister support for powernow-k8 This patch provides support for the _PSD ACPI object in the Powernow-k8 driver. Although it looks like an invasive patch, most of it is simply the consequence of turning the static acpi_performance_data structure into a pointer. AMD has tested it on several machines over the past few days without issue. [trivial checkpatch warnings fixed up by davej] [X86_POWERNOW_K8_ACPI=n buildfix from Randy Dunlap] Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com> Tested-by: Frank Arnold <frank.arnold@amd.com> Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Dave Jones <davej@redhat.com>	2008-08-08 16:00:49 -04:00
Mark Langsdorf	23431b495f	[CPUFREQ][1/2] whitespace fix for powernow-k8 Trivial whitespace fix for powernow-k8. Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com> Signed-off-by: Dave Jones <davej@redhat.com>	2008-08-08 16:00:49 -04:00
Dave Jones	460f5ef283	[CPUFREQ] Fix warning in elanfreq arch/x86/kernel/cpu/cpufreq/elanfreq.c:47:26: warning: symbol 'elan_multiplier' was not declared. Should it be static? Yes, yes it should. Signed-off-by: Dave Jones <davej@redhat.com>	2008-08-08 16:00:48 -04:00
Dave Jones	ec983f7060	[CPUFREQ] Remove EXPERIMENTAL annotation from VIA C7 powersaver kconfig. This has been pretty solid, and doesn't see much change at all. Noticed by Harald Welte. Signed-off-by: Dave Jones <davej@redhat.com>	2008-08-08 16:00:48 -04:00
Linus Torvalds	84ff7a0012	Merge branch 'kvm-updates-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm * 'kvm-updates-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm: KVM: s390: Fix kvm on IBM System z10 KVM: Advertise synchronized mmu support to userspace KVM: Synchronize guest physical memory map to host virtual memory map KVM: Allow browsing memslots with mmu_lock KVM: Allow reading aliases with mmu_lock	2008-08-01 12:48:16 -07:00
Linus Torvalds	57b1494d2b	Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: generic, x86: fix add iommu_num_pages helper function x86: remove stray <6> in BogoMIPS printk x86: move dma32_reserve_bootmem() after reserve_crashkernel()	2008-08-01 10:28:17 -07:00
Krzysztof Helt	e0d22d03c0	x86: fdiv bug detection fix The fdiv detection code writes s32 integer into the boot_cpu_data.fdiv_bug. However, the boot_cpu_data.fdiv_bug is only char (s8) field so the detection overwrites already set fields for other bugs, e.g. the f00f bug field. Use local s32 variable to receive result. This is a partial fix to Bugzilla #9928 - fixes wrong information about the f00f bug (tested) and probably for coma bug (I have no cpu to test this). Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-31 23:56:27 +02:00
Gustavo F. Padovan	e9c8abb66c	x86: coding style fixes to arch/x86/kernel/sys_x86_64.c Fix all errors and many warnings reported by checkpatch.pl without change sys_x86_64.o arch/x86/kernel/sys_x86_64.o: text data bss dec hex filename 1567 0 0 1567 61f sys_x86_64.o.after 1567 0 0 1567 61f sys_x86_64.o.before md5: de28ffedcb5851dfd7ec87a03afec1fd sys_x86_64.o.after de28ffedcb5851dfd7ec87a03afec1fd sys_x86_64.o.before Signed-off-by: Gustavo F. Padovan <gustavo@las.ic.unicamp.br> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-31 18:21:33 +02:00
Gustavo F. Padovan	4df9e510a9	x86: coding style fixes to arch/x86/kernel/traps_64.c Fix all errors and many warnings reported by checkpatch.pl. Except for the change of include <asm/io.h> to <linux/io.h>, the traps.o before and after the changes are the same. Signed-off-by: Gustavo F. Padovan <gustavo@las.ic.unicamp.br> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-31 18:21:32 +02:00
Gustavo F. Padovan	caa007dd36	x86: coding style fixes to arch/x86/kernel/signal_64.c Fix all errors and many warnings reported by checkpatch.pl without change signal_64.o arch/x86/kernel/signal_64.o text data bss dec hex filename 5143 0 8 5151 141f signal_64.o.after 5143 0 8 5151 141f signal_64.o.before md5: e68718092b3641cb27e79e55ce57e3ad signal_64.o.after e68718092b3641cb27e79e55ce57e3ad signal_64.o.before Signed-off-by: Gustavo F. Padovan <gustavo@las.ic.unicamp.br> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-31 18:21:32 +02:00
Gustavo F. Padovan	08aadf069d	x86: coding style fixes to arch/x86/kernel/crash_dump_64.c Fix coding style without change crash_dump_64.o arch/x86/kernel/crash_dump_64.o text data bss dec hex filename 129 0 0 129 81 crash_dump_64.o.after 129 0 0 129 81 crash_dump_64.o.before md5: 885b52c1b92737e6b12e5107e90fc1f1 crash_dump_64.o.after 885b52c1b92737e6b12e5107e90fc1f1 crash_dump_64.o.before Signed-off-by: Gustavo F. Padovan <gustavo@las.ic.unicamp.br> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-31 18:21:31 +02:00
Gustavo F. Padovan	8092c654de	x86: add KERN_INFO to printks on process_64.c Fix many coding style warnings. Signed-off-by: Gustavo F. Padovan <gustavo@las.ic.unicamp.br> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-31 18:21:31 +02:00
Gustavo F. Padovan	7de08b4e1e	x86: coding styles fixes to arch/x86/kernel/process_64.c Fix about 50 errors and many warnings without change process_64.o arch/x86/kernel/process_64.o: text data bss dec hex filename 5236 8 24 5268 1494 process_64.o.after 5236 8 24 5268 1494 process_64.o.before md5: 9c35e9debdea4e471288c6e8ca267a75 process_64.o.after 9c35e9debdea4e471288c6e8ca267a75 process_64.o.before Signed-off-by: Gustavo F. Padovan <gustavo@las.ic.unicamp.br> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-31 18:21:30 +02:00
H. Peter Anvin	6152e4b1c9	x86, xsave: keep the XSAVE feature mask as an u64 The XSAVE feature mask is a 64-bit number; keep it that way, in order to avoid the mistake done with rdmsr/wrmsr. Use the xsetbv() function provided in the previous patch. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-30 19:50:35 +02:00
Suresh Siddha	42deec6f2c	x86, xsave: update xsave header bits during ptrace fpregs set FP/SSE bits may be zero in the xsave header(representing the init state). Update these bits during the ptrace fpregs set operation, to indicate the non-init state. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-30 19:49:28 +02:00
Suresh Siddha	c37b5efea4	x86, xsave: save/restore the extended state context in sigframe On cpu's supporting xsave/xrstor, fpstate pointer in the sigcontext, will include the extended state information along with fpstate information. Presence of extended state information is indicated by the presence of FP_XSTATE_MAGIC1 at fpstate.sw_reserved.magic1 and FP_XSTATE_MAGIC2 at fpstate + (fpstate.sw_reserved.extended_size - FP_XSTATE_MAGIC2_SIZE). Extended feature bit mask that is saved in the memory layout is represented by the fpstate.sw_reserved.xstate_bv For RT signal frames, UC_FP_XSTATE in the uc_flags also indicate the presence of extended state information in the sigcontext's fpstate pointer. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-30 19:49:27 +02:00
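A minimal userspace sketch of the presence check described above, under stated assumptions: `struct sw_bytes` is a simplified stand-in for `fpstate.sw_reserved`, `has_extended_state()` is a hypothetical helper, and the magic values are used for illustration only. It checks the first magic in the software-reserved bytes and the second magic at `fpstate + (extended_size - FP_XSTATE_MAGIC2_SIZE)`.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FP_XSTATE_MAGIC1	0x46505853U	/* illustrative value */
#define FP_XSTATE_MAGIC2	0x46505845U	/* illustrative value */
#define FP_XSTATE_MAGIC2_SIZE	sizeof(uint32_t)

/* Simplified stand-in for fpstate.sw_reserved. */
struct sw_bytes {
	uint32_t magic1;
	uint32_t extended_size;
	uint64_t xstate_bv;
};

/* Returns 1 when both magic markers are where the layout says they are. */
static int has_extended_state(const unsigned char *fpstate, size_t buf_len,
			      const struct sw_bytes *sw)
{
	uint32_t magic2;

	if (sw->magic1 != FP_XSTATE_MAGIC1)
		return 0;
	if (sw->extended_size < FP_XSTATE_MAGIC2_SIZE ||
	    sw->extended_size > buf_len)
		return 0;

	/* MAGIC2 sits in the last bytes of the extended area. */
	memcpy(&magic2, fpstate + sw->extended_size - FP_XSTATE_MAGIC2_SIZE,
	       sizeof(magic2));
	return magic2 == FP_XSTATE_MAGIC2;
}
```

The bounds check on `extended_size` matters because the value comes from a user-visible frame and should not be trusted before it is used as an offset.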
Suresh Siddha	ab5137015f	x86, xsave: reorganization of signal save/restore fpstate code layout move 64bit routines that saves/restores fpstate in/from user stack from signal_64.c to xsave.c restore_i387_xstate() now handles the condition when user passes NULL fpstate. Other misc changes for preparation of xsave/xrstor sigcontext support. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-30 19:49:26 +02:00
Suresh Siddha	3c1c7f1014	x86, xsave: dynamically allocate sigframes fpstate instead of static allocation dynamically allocate fpstate on the stack, instead of static allocation in the current sigframe layout on the user stack. This will allow the fpstate structure to grow in the future, which includes extended state information supporting xsave/xrstor. signal handlers will be able to access the fpstate pointer from the sigcontext structure as usual, with no change. For the non RT sigframe's (which are supported only for 32bit apps), current static fpstate layout in the sigframe will be unused (so that we don't change the extramask[] offset in the sigframe and thus prevent breaking app's which modify extramask[]). Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-30 19:49:25 +02:00
Suresh Siddha	b359e8a434	x86, xsave: context switch support using xsave/xrstor Uses xsave/xrstor (instead of traditional fxsave/fxrstor) in context switch when available. Introduces TS_XSAVE flag, which determines the need to use xsave/xrstor instructions during context switch instead of the legacy fxsave/fxrstor instructions. Thread-synchronous status word is already in L1 cache during this code path and thus minimizes the performance penalty compared to (cpu_has_xsave) checks. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-30 19:49:24 +02:00
Suresh Siddha	dc1e35c6e9	x86, xsave: enable xsave/xrstor on cpus with xsave support Enables xsave/xrstor by turning on cr4.osxsave on cpu's which have the xsave support. For now, features that OS supports/enabled are FP and SSE. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-30 19:49:24 +02:00
Suresh Siddha	a648bf4632	x86, xsave: xsave cpuid feature bits Add xsave CPU feature bits. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-30 19:49:23 +02:00
Ingo Molnar	15dd859cac	Merge commit 'v2.6.27-rc1' into x86/core Conflicts: include/asm-x86/dma-mapping.h include/asm-x86/namei.h include/asm-x86/uaccess.h Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-30 19:33:48 +02:00
Ingo Molnar	b2d9d33412	Merge branch 'x86/fpu' into x86/core	2008-07-30 19:32:39 +02:00
Vitaly Mayatskikh	afd962a9e8	x86: wrong register was used in align macro New ALIGN_DESTINATION macro has sad typo: r8d register was used instead of ecx in fixup section. This can be considered as a regression. Register ecx was also wrongly loaded with value in r8d in copy_user_nocache routine. Signed-off-by: Vitaly Mayatskikh <v.mayatskih@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-30 10:10:39 -07:00
Jack Steiner	0d39741a27	GRU Driver: export is_uv_system(), zap_page_range() & follow_page() Exports needed by the GRU driver. Signed-off-by: Jack Steiner <steiner@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-30 09:41:48 -07:00
FUJITA Tomonori	8978b74253	generic, x86: fix add iommu_num_pages helper function This IOMMU helper function doesn't work for some architectures: http://marc.info/?l=linux-kernel&m=121699304403202&w=2 It also breaks POWER and SPARC builds: http://marc.info/?l=linux-kernel&m=121730388001890&w=2 Currently, only x86 IOMMUs use this so let's move it to x86 for now. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-29 12:12:48 +02:00
Ingo Molnar	35780c8ea7	Merge commit 'v2.6.27-rc1' into x86/urgent	2008-07-29 12:10:50 +02:00
Avi Kivity	ed84862433	KVM: Advertise synchronized mmu support to userspace Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-29 12:34:02 +03:00
Andrea Arcangeli	e930bffe95	KVM: Synchronize guest physical memory map to host virtual memory map Synchronize changes to host virtual addresses which are part of a KVM memory slot to the KVM shadow mmu. This allows pte operations like swapping, page migration, and madvise() to transparently work with KVM. Signed-off-by: Andrea Arcangeli <andrea@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-29 12:33:53 +03:00
Andrea Arcangeli	604b38ac03	KVM: Allow browsing memslots with mmu_lock This allows reading memslots with only the mmu_lock held for mmu notifiers that run in atomic context and with mmu_lock held. Signed-off-by: Andrea Arcangeli <andrea@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-29 12:33:50 +03:00
Andrea Arcangeli	a1708ce8a3	KVM: Allow reading aliases with mmu_lock This allows the mmu notifier code to run unalias_gfn with only the mmu_lock held. Only alias writes need the mmu_lock held. Readers will either take the slots_lock in read mode or the mmu_lock. Signed-off-by: Andrea Arcangeli <andrea@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-29 12:33:40 +03:00
Linus Torvalds	7874d35173	Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus * git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: lguest: turn Waker into a thread, not a process lguest: Enlarge virtio rings lguest: Use GSO/IFF_VNET_HDR extensions on tun/tap lguest: Remove 'network: no dma buffer!' warning lguest: Adaptive timeout lguest: Tell Guest net not to notify us on every packet xmit lguest: net block unneeded receive queue update notifications lguest: wrap last_avail accesses. lguest: use cpu capability accessors lguest: virtio-rng support lguest: Support assigning a MAC address lguest: Don't leak /dev/zero fd lguest: fix verbose printing of device features. lguest: fix switcher_page leak on unload lguest: Guest int3 fix lguest: set max_pfn_mapped, growl loudly at Yinghai Lu	2008-07-28 18:16:26 -07:00
Linus Torvalds	1d9b9f6a53	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (21 commits) x86/PCI: use dev_printk when possible PCI: add D3 power state avoidance quirk PCI: fix bogus "'device' may be used uninitialized" warning in pci_slot PCI: add an option to allow ASPM enabled forcibly PCI: disable ASPM on pre-1.1 PCIe devices PCI: disable ASPM per ACPI FADT setting PCI MSI: Don't disable MSIs if the mask bit isn't supported PCI: handle 64-bit resources better on 32-bit machines PCI: rewrite PCI BAR reading code PCI: document pci_target_state PCI hotplug: fix typo in pcie hotplug output x86 gart: replace to_pages macro with iommu_num_pages x86, AMD IOMMU: replace to_pages macro with iommu_num_pages iommu: add iommu_num_pages helper function dma-coherent: add documentation to new interfaces Cris: convert to using generic dma-coherent mem allocator Sh: use generic per-device coherent dma allocator ARM: support generic per-device coherent dma mem Generic dma-coherent: fix DMA_MEMORY_EXCLUSIVE x86: use generic per-device dma coherent allocator ...	2008-07-28 18:14:24 -07:00
Linus Torvalds	9b79022ca9	Fix 'get_user_pages_fast()' with non-page-aligned start address Alexey Dobriyan reported trouble with LTP with the new fast-gup code, and Johannes Weiner debugged it to non-page-aligned addresses, where the new get_user_pages_fast() code would do all the wrong things, including just traversing past the end of the requested area due to 'addr' never matching 'end' exactly. This is not a pretty fix, and we may actually want to move the alignment into generic code, leaving just the core code per-arch, but Alexey verified that the vmsplice01 LTP test doesn't crash with this. Reported-and-tested-by: Alexey Dobriyan <adobriyan@gmail.com> Debugged-by: Johannes Weiner <hannes@saeurebad.de> Cc: Nick Piggin <npiggin@suse.de> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-28 17:54:21 -07:00
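The alignment problem described above can be sketched in userspace C. This is an illustration, not the kernel fix itself: `struct walk_range`, `page_walk_range()` and `count_pages()` are hypothetical names, and a 4 KiB page size is assumed. The walk advances in whole pages and stops when `addr` equals `end`, so both bounds have to be page-aligned for the loop to terminate.

```c
#define PAGE_SHIFT	12UL			/* assumed 4 KiB pages */
#define PAGE_SIZE	(1UL << PAGE_SHIFT)
#define PAGE_MASK	(~(PAGE_SIZE - 1))

struct walk_range {
	unsigned long start;
	unsigned long end;
};

/* Round the start down to a page boundary and derive end from it. */
static struct walk_range page_walk_range(unsigned long start, int nr_pages)
{
	struct walk_range r;

	r.start = start & PAGE_MASK;
	r.end = r.start + ((unsigned long)nr_pages << PAGE_SHIFT);
	return r;
}

/* Page-stepped walk that relies on addr hitting end exactly. */
static int count_pages(struct walk_range r)
{
	int n = 0;

	for (unsigned long addr = r.start; addr != r.end; addr += PAGE_SIZE)
		n++;
	return n;
}
```

If `end` were computed from the unaligned `start` while `addr` stepped along page boundaries, the `addr != end` test would never become false and the walk would run past the requested area.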
Rusty Russell	5d006d8d09	lguest: set max_pfn_mapped, growl loudly at Yinghai Lu `6af61a7614` 'x86: clean up max_pfn_mapped usage - 32-bit' makes the following comment: XEN PV and lguest may need to assign max_pfn_mapped too. But no CC. Yinghai, wasting fellow developers' time is a VERY bad habit. If you do it again, I will hunt you down and try to extract the three hours of my life I just lost :) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Yinghai Lu <yhlu.kernel@gmail.com>	2008-07-29 09:58:31 +10:00
Andrea Arcangeli	cddb8a5c14	mmu-notifiers: core With KVM/GRU/XPMEM there isn't just the primary CPU MMU pointing to pages. There are secondary MMUs (with secondary sptes and secondary tlbs) too. sptes in the kvm case are shadow pagetables, but when I say spte in mmu-notifier context, I mean "secondary pte". In GRU case there's no actual secondary pte and there's only a secondary tlb because the GRU secondary MMU has no knowledge about sptes and every secondary tlb miss event in the MMU always generates a page fault that has to be resolved by the CPU (this is not the case of KVM where a secondary tlb miss will walk sptes in hardware and it will refill the secondary tlb transparently to software if the corresponding spte is present). The same way zap_page_range has to invalidate the pte before freeing the page, the spte (and secondary tlb) must also be invalidated before any page is freed and reused. Currently we take a page_count pin on every page mapped by sptes, but that means the pages can't be swapped whenever they're mapped by any spte because they're part of the guest working set. Furthermore a spte unmap event can immediately lead to a page being freed when the pin is released (so requiring the same complex and relatively slow tlb_gather smp safe logic we have in zap_page_range and that can be avoided completely if the spte unmap event doesn't require an unpin of the page previously mapped in the secondary MMU). The mmu notifiers allow kvm/GRU/XPMEM to attach to the tsk->mm and know when the VM is swapping or freeing or doing anything on the primary MMU so that the secondary MMU code can drop sptes before the pages are freed, avoiding all page pinning and allowing 100% reliable swapping of guest physical address space. Furthermore it avoids the code that tears down the mappings of the secondary MMU, to implement a logic like tlb_gather in zap_page_range that would require many IPI to flush other cpu tlbs, for each fixed number of spte unmapped. 
To make an example: if what happens on the primary MMU is a protection downgrade (from writeable to wrprotect) the secondary MMU mappings will be invalidated, and the next secondary-mmu-page-fault will call get_user_pages and trigger a do_wp_page through get_user_pages if it called get_user_pages with write=1, and it'll re-establish an updated spte or secondary-tlb-mapping on the copied page. Or it will setup a readonly spte or readonly tlb mapping if it's a guest-read, if it calls get_user_pages with write=0. This is just an example. This allows to map any page pointed by any pte (and in turn visible in the primary CPU MMU), into a secondary MMU (be it a pure tlb like GRU, or a full MMU with both sptes and secondary-tlb like the shadow-pagetable layer with kvm), or a remote DMA in software like XPMEM (hence needing of schedule in XPMEM code to send the invalidate to the remote node, while no need to schedule in kvm/gru as it's an immediate event like invalidating primary-mmu pte). At least for KVM without this patch it's impossible to swap guests reliably. And having this feature and removing the page pin allows several other optimizations that simplify life considerably. Dependencies: 1) mm_take_all_locks() to register the mmu notifier when the whole VM isn't doing anything with "mm". This allows mmu notifier users to keep track if the VM is in the middle of the invalidate_range_begin/end critical section with an atomic counter increased in range_begin and decreased in range_end. No secondary MMU page fault is allowed to map any spte or secondary tlb reference, while the VM is in the middle of range_begin/end as any page returned by get_user_pages in that critical section could later immediately be freed without any further ->invalidate_page notification (invalidate_range_begin/end works on ranges and ->invalidate_page isn't called immediately before freeing the page). 
To stop all page freeing and pagetable overwrites the mmap_sem must be taken in write mode and all other anon_vma/i_mmap locks must be taken too. 2) It'd be a waste to add branches in the VM if nobody could possibly run KVM/GRU/XPMEM on the kernel, so mmu notifiers will only be enabled if CONFIG_KVM=m/y. In the current kernel kvm won't yet take advantage of mmu notifiers, but this already allows to compile a KVM external module against a kernel with mmu notifiers enabled and from the next pull from kvm.git we'll start using them. And GRU/XPMEM will also be able to continue the development by enabling KVM=m in their config, until they submit all GRU/XPMEM GPLv2 code to the mainline kernel. Then they can also enable MMU_NOTIFIERS in the same way KVM does it (even if KVM=n). This guarantees nobody selects MMU_NOTIFIER=y if KVM and GRU and XPMEM are all =n. The mmu_notifier_register call can fail because mm_take_all_locks may be interrupted by a signal and return -EINTR. Because mmu_notifier_register is used when a driver starts up, a failure can be gracefully handled. Here is an example of the change applied to kvm to register the mmu notifiers. Usually when a driver starts up other allocations are required anyway and -ENOMEM failure paths exist already. struct kvm *kvm_arch_create_vm(void) { struct kvm *kvm = kzalloc(sizeof(struct kvm), GFP_KERNEL); + int err; if (!kvm) return ERR_PTR(-ENOMEM); INIT_LIST_HEAD(&kvm->arch.active_mmu_pages); + kvm->arch.mmu_notifier.ops = &kvm_mmu_notifier_ops; + err = mmu_notifier_register(&kvm->arch.mmu_notifier, current->mm); + if (err) { + kfree(kvm); + return ERR_PTR(err); + } + return kvm; } mmu_notifier_unregister returns void and it's reliable. The patch also adds a few needed but missing includes that would prevent kernel to compile after these changes on non-x86 archs (x86 didn't need them by luck). 
[akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: fix mm/filemap_xip.c build] [akpm@linux-foundation.org: fix mm/mmu_notifier.c build] Signed-off-by: Andrea Arcangeli <andrea@qumranet.com> Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Christoph Lameter <cl@linux-foundation.org> Cc: Jack Steiner <steiner@sgi.com> Cc: Robin Holt <holt@sgi.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Kanoj Sarcar <kanojsarcar@yahoo.com> Cc: Roland Dreier <rdreier@cisco.com> Cc: Steve Wise <swise@opengridcomputing.com> Cc: Avi Kivity <avi@qumranet.com> Cc: Hugh Dickins <hugh@veritas.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Anthony Liguori <aliguori@us.ibm.com> Cc: Chris Wright <chrisw@redhat.com> Cc: Marcelo Tosatti <marcelo@kvack.org> Cc: Eric Dumazet <dada1@cosmosbay.com> Cc: "Paul E. McKenney" <paulmck@us.ibm.com> Cc: Izik Eidus <izike@qumranet.com> Cc: Anthony Liguori <aliguori@us.ibm.com> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-28 16:30:21 -07:00
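The atomic-counter bookkeeping mentioned in dependency 1) can be sketched in userspace with C11 atomics. This is a deliberately reduced illustration, not the KVM implementation: `struct secondary_mmu`, `range_begin()`, `range_end()` and `may_map_spte()` are hypothetical names, and real users would need additional locking or a retry scheme around the check that is not shown here.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-mm state kept by a secondary MMU driver. */
struct secondary_mmu {
	atomic_int invalidate_count;	/* > 0 while inside range_begin/end */
};

static void secondary_mmu_init(struct secondary_mmu *m)
{
	atomic_init(&m->invalidate_count, 0);
}

/* Called when the VM enters an invalidate_range_begin/end section. */
static void range_begin(struct secondary_mmu *m)
{
	atomic_fetch_add(&m->invalidate_count, 1);
}

/* Called when the VM leaves the section. */
static void range_end(struct secondary_mmu *m)
{
	atomic_fetch_sub(&m->invalidate_count, 1);
}

/* A secondary page fault may only install a mapping outside any section. */
static bool may_map_spte(struct secondary_mmu *m)
{
	return atomic_load(&m->invalidate_count) == 0;
}
```

A counter rather than a flag is used because several range invalidations may be in flight at once, and mapping must stay blocked until the last one ends.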
Bjorn Helgaas	12c0b20fa4	x86/PCI: use dev_printk when possible Convert printks to use dev_printk(). I converted DBG() to dev_dbg(). This DBG() is from arch/x86/pci/pci.h and requires source-code modification to enable, so dev_dbg() seems roughly equivalent. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2008-07-28 15:32:26 -07:00
Jesse Barnes	756f7bc668	Merge branch 'core/generic-dma-coherent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip into for-linus	2008-07-28 15:15:46 -07:00
Ingo Molnar	cb28a1bbdb	Merge branch 'linus' into core/generic-dma-coherent Conflicts: arch/x86/Kconfig Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-29 00:07:55 +02:00
Jesse Barnes	29111f579f	Merge branch 'x86/iommu' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip into for-linus	2008-07-28 14:31:10 -07:00
Linus Torvalds	e56b3bc794	cpu masks: optimize and clean up cpumask_of_cpu() Clean up and optimize cpumask_of_cpu(), by sharing all the zero words. Instead of stupidly generating all possible i=0...NR_CPUS 2^i patterns creating a huge array of constant bitmasks, realize that the zero words can be shared. In other words, on a 64-bit architecture, we only ever need 64 of these arrays - with a different bit set in one single word (with enough zero words around it so that we can create any bitmask by just offsetting in that big array). And then we just put enough zeroes around it that we can point every single cpumask to be one of those things. So when we have 4k CPU's, instead of having 4k arrays (of 4k bits each, with one bit set in each array - 2MB memory total), we have exactly 64 arrays instead, each 8k bits in size (64kB total). And then we just point cpumask(n) to the right position (which we can calculate dynamically). Once we have the right arrays, getting "cpumask(n)" ends up being: static inline const cpumask_t *get_cpu_mask(unsigned int cpu) { const unsigned long *p = cpu_bit_bitmap[1 + cpu % BITS_PER_LONG]; p -= cpu / BITS_PER_LONG; return (const cpumask_t *)p; } This brings other advantages and simplifications as well: - we are not wasting memory that is just filled with a single bit in various different places - we don't need all those games to re-create the arrays in some dense format, because they're already going to be dense enough. if we compile a kernel for up to 4k CPU's, "wasting" that 64kB of memory is a non-issue (especially since by doing this "overlapping" trick we probably get better cache behaviour anyway). [ mingo@elte.hu: Converted Linus's mails into a commit. See: http://lkml.org/lkml/2008/7/27/156 http://lkml.org/lkml/2008/7/28/320 Also applied a family filter - which also has the side-effect of leaving out the bits where Linus calls me an idio... 
Oh, never mind ;-) ] Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-28 22:20:41 +02:00
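The overlapping-array trick from the cpumask_of_cpu() commit can be tried out in userspace. The sketch below is an illustration under assumptions, not the kernel source: `NR_CPUS` is set to a hypothetical 256, the table is a flat array filled at run time by a helper named `cpu_bit_bitmap_init()`, and `mask_test_bit()` is added only to inspect the result. Row 0 is all zeroes and row `1 + i` has bit `i` set in its first word, so stepping the pointer back by `cpu / BITS_PER_LONG` words places that single set word at the right index of the returned mask.

```c
#include <limits.h>
#include <stddef.h>

#define NR_CPUS		256	/* hypothetical build-time limit */
#define BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define MASK_LONGS	((NR_CPUS + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* (BITS_PER_LONG + 1) rows of MASK_LONGS words, stored flat. */
static unsigned long cpu_bit_bitmap[(BITS_PER_LONG + 1) * MASK_LONGS];

/* Row 0 stays zero; row 1 + bit gets a single bit in its first word. */
static void cpu_bit_bitmap_init(void)
{
	for (size_t bit = 0; bit < BITS_PER_LONG; bit++)
		cpu_bit_bitmap[(1 + bit) * MASK_LONGS] = 1UL << bit;
}

/* Point into the table so that the set word lands at cpu / BITS_PER_LONG. */
static const unsigned long *get_cpu_mask(unsigned int cpu)
{
	const unsigned long *p =
		&cpu_bit_bitmap[(1 + cpu % BITS_PER_LONG) * MASK_LONGS];

	p -= cpu / BITS_PER_LONG;
	return p;
}

/* Inspection helper: is the given bit set in a MASK_LONGS-word mask? */
static int mask_test_bit(const unsigned long *mask, unsigned int bit)
{
	return (int)((mask[bit / BITS_PER_LONG] >> (bit % BITS_PER_LONG)) & 1UL);
}
```

The words before the set word come from the zero tail of the preceding row, and the words after it from the zero tail of the same row, which is why the leading all-zero row is needed.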
Ingo Molnar	414f746d23	Merge branch 'linus' into cpus4096	2008-07-28 21:14:43 +02:00
Ingo Molnar	6ce37a58e3	Merge branch 'x86/crashdump' into x86/urgent	2008-07-28 17:19:02 +02:00
Ingo Molnar	71998e83c5	Merge branch 'x86-tracehook' of git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-utrace into x86/tracehook	2008-07-28 17:03:43 +02:00
Ingo Molnar	239bd83104	x86: L3 cache index disable for 2.6.26, fix #2 fix !PCI build failure: arch/x86/kernel/cpu/intel_cacheinfo.c: In function 'get_k8_northbridge': arch/x86/kernel/cpu/intel_cacheinfo.c:675: error: implicit declaration of function 'pci_match_id' Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-28 16:49:50 +02:00
Ingo Molnar	b7d0b67845	Merge branch 'linus' into x86/cpu Conflicts: arch/x86/kernel/cpu/intel_cacheinfo.c Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-28 16:26:31 +02:00
Ingo Molnar	cdcf772ed1	x86 l3 cache index disable for 2 6 26 fix Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-28 16:22:07 +02:00
Mark Langsdorf	a24e8d36f5	x86: L3 cache index disable for 2.6.26 On Monday 21 July 2008, Ingo Molnar wrote: > > applied to tip/x86/cpu, thanks Mark. > > > > I've done some coding style fixes for the new functions you've > > introduced, see that commit below. > > -tip testing found the following build failure: > > arch/x86/kernel/built-in.o: In function `show_cache_disable': > intel_cacheinfo.c:(.text+0xbbf2): undefined reference to `k8_northbridges' > arch/x86/kernel/built-in.o: In function `store_cache_disable': > intel_cacheinfo.c:(.text+0xbd91): undefined reference to `k8_northbridges' > > please send a delta fix patch against the tip/x86/cpu branch: > > http://people.redhat.com/mingo/tip.git/README > > which has your patch plus the cleanup applied. delta fix patch follows. It removes the dependency on k8_northbridges. -Mark Langsdorf Operating System Research Center AMD Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-28 16:22:06 +02:00
Ingo Molnar	7a4983bb5f	x86: L3 cache index disable for 2.6.26, cleanups No change in functionality. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-28 16:17:47 +02:00
Mark Langsdorf	8cb22bcb1f	x86: L3 cache index disable for 2.6.26 New versions of AMD processors have support to disable parts of their L3 caches if too many MCEs are generated by the L3 cache. This patch provides a /sysfs interface under the cache hierarchy to display which cache indices are disabled (if any) and to allow monitoring applications to disable a cache index. This patch does not set an automatic policy to disable the L3 cache. Policy decisions would need to be made by a RAS handler. This patch merely makes it easier to see what indices are currently disabled. Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-28 16:17:43 +02:00
Linus Torvalds	fb4284b2b7	Merge branch 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-tip * 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-tip: x86: fix cpu hotplug on 32bit	2008-07-27 16:46:51 -07:00
Thomas Gleixner	583323b9d2	x86: fix cpu hotplug on 32bit commit `3e9704739d` ("x86: boot secondary cpus through initial_code") causes the kernel to crash when a CPU is brought online after the read only sections have been write protected. The write to initial_code in do_boot_cpu() fails. Move initial_code to .cpuinit.data section. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: H. Peter Anvin <hpa@zytor.com>	2008-07-27 21:43:11 +02:00
Sheng Yang	5fdbcb9dd1	KVM: VMX: Fix undefined behaviour of EPT after reload kvm-intel.ko As well as move set base/mask ptes to vmx_init(). Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-27 11:34:10 +03:00
Sheng Yang	5ec5726a16	KVM: VMX: Fix bypass_guest_pf enabling when disable EPT in module parameter Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-27 11:34:10 +03:00
Marcelo Tosatti	c93cd3a588	KVM: task switch: translate guest segment limit to virt-extension byte granular field If 'g' is one then limit is 4kb granular. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-27 11:34:10 +03:00
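The granularity rule stated above ("If 'g' is one then limit is 4kb granular") can be written out as a short sketch. `byte_granular_limit()` is a hypothetical helper name; it takes the raw 20-bit descriptor limit and the G bit and returns the limit in bytes.

```c
#include <stdint.h>

/*
 * With G set, the limit counts 4 KiB units, so scale it by 4096 and
 * fill the low 12 bits: the last byte of the last unit is addressable.
 */
static uint32_t byte_granular_limit(uint32_t limit, int g)
{
	if (g)
		limit = (limit << 12) | 0xfff;
	return limit;
}
```

For instance, a raw limit of 0xfffff with G set yields 0xffffffff, the full 4 GiB range.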
Avi Kivity	577bdc4966	KVM: Avoid instruction emulation when event delivery is pending When an event (such as an interrupt) is injected, and the stack is shadowed (and therefore write protected), the guest will exit. The current code will see that the stack is shadowed and emulate a few instructions, each time postponing the injection. Eventually the injection may succeed, but at that time the guest may be unwilling to accept the interrupt (for example, the TPR may have changed). This occurs every once in a while during a Windows 2008 boot. Fix by unshadowing the fault address if the fault was due to an event injection. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-27 11:34:10 +03:00
Marcelo Tosatti	34198bf842	KVM: task switch: use seg regs provided by subarch instead of reading from GDT There is no guarantee that the old TSS descriptor in the GDT contains the proper base address. This is the case for Windows installation's reboot-via-triplefault. Use guest registers instead. Also translate the address properly. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-27 11:34:09 +03:00
Marcelo Tosatti	98899aa0e0	KVM: task switch: segment base is linear address The segment base is always a linear address, so translate before accessing guest memory. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-27 11:34:09 +03:00
Joerg Roedel	5f4cb662a0	KVM: SVM: allow enabling/disabling NPT by reloading only the architecture module If NPT is enabled after loading both KVM modules on AMD and it should be disabled, both KVM modules must be reloaded. If only the architecture module is reloaded the behavior is undefined. With this patch it is possible to disable NPT only by reloading the kvm_amd module. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-27 11:34:09 +03:00
Yinghai Lu	d25ae38b7e	x86: add apic probe for genapic 64bit - fix intr_remapping_enabled get assigned later, so need to check that in setup_apic_routing Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Cc: Jack Steiner <steiner@sgi.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-27 06:28:16 +02:00
Roland McGrath	99bbc4b1e6	x86: tracehook: CONFIG_HAVE_ARCH_TRACEHOOK The x86 arch code has all the prerequisites, so set HAVE_ARCH_TRACEHOOK. Signed-off-by: Roland McGrath <roland@redhat.com>	2008-07-26 14:38:06 -07:00
Roland McGrath	59e52130f0	x86: tracehook: TIF_NOTIFY_RESUME This adds TIF_NOTIFY_RESUME support for x86, both 64-bit and 32-bit. When set, we call tracehook_notify_resume() on the way to user mode. Signed-off-by: Roland McGrath <roland@redhat.com>	2008-07-26 14:38:05 -07:00
Roland McGrath	4dfcbb997a	x86 signals: use asm/syscall.h Replace local inlines with the asm/syscall.h interfaces that do the same things. Signed-off-by: Roland McGrath <roland@redhat.com>	2008-07-26 14:38:04 -07:00
Roland McGrath	eeea3c3ff8	x86: tracehook syscall This changes x86 syscall tracing to use the new tracehook.h entry points. There is no change, only cleanup. Signed-off-by: Roland McGrath <roland@redhat.com>	2008-07-26 14:38:00 -07:00
Roland McGrath	36a033082b	x86: tracehook_signal_handler This makes the x86 signal handling code use tracehook_signal_handler() in place of calling into ptrace guts. The call is moved after the sa_mask processing, but there is no other change. This cleanup doesn't matter to existing debuggers, but is the sensible thing: have all facets of the handler setup complete before the debugger inspects the task again. Signed-off-by: Roland McGrath <roland@redhat.com>	2008-07-26 14:37:59 -07:00
Linus Torvalds	fb3b806144	Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, AMD IOMMU: include amd_iommu_last_bdf in device initialization x86: fix IBM Summit based systems' phys_cpu_present_map on 32-bit kernels x86, RDC321x: remove gpio.h complications x86, RDC321x: add to mach-default crashdump: fix undefined reference to `elfcorehdr_addr' flag parameters: fix compile error of sys_epoll_create1	2008-07-26 13:25:05 -07:00
Johannes Weiner	8dad322f54	x86: use generic show_mem() Remove arch-specific show_mem() in favor of the generic version. This also removes the following redundant information display: - pages in swapcache, printed by show_swap_cache_info() - dirty pages, writeback pages, mapped pages, slab pages, pagetable pages, printed by show_free_areas() where show_mem() calls show_free_areas(), which calls show_swap_cache_info(). Signed-off-by: Johannes Weiner <hannes@saeurebad.de> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:10 -07:00
Roland McGrath	6341c393fc	tracehook: exec This moves all the ptrace hooks related to exec into tracehook.h inlines. This also lifts the calls for tracing out of the binfmt load_binary hooks into search_binary_handler() after it calls into the binfmt module. This change has no effect, since all the binfmt modules' load_binary functions did the call at the end on success, and now search_binary_handler() does it immediately after return if successful. We consolidate the repeated code, and binfmt modules no longer need to import ptrace_notify(). Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Reviewed-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:08 -07:00
Nick Piggin	652ea69536	x86: support 1GB hugepages with get_user_pages_lockless() Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Andi Kleen <andi@firstfloor.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:06 -07:00
Nick Piggin	8174c430e4	x86: lockless get_user_pages_fast() Implement get_user_pages_fast without locking in the fastpath on x86. Do an optimistic lockless pagetable walk, without taking mmap_sem or any page table locks or even mmap_sem. Page table existence is guaranteed by turning interrupts off (combined with the fact that we're always looking up the current mm, means we can do the lockless page table walk within the constraints of the TLB shootdown design). Basically we can do this lockless pagetable walk in a similar manner to the way the CPU's pagetable walker does not have to take any locks to find present ptes. This patch (combined with the subsequent ones to convert direct IO to use it) was found to give about 10% performance improvement on a 2 socket 8 core Intel Xeon system running an OLTP workload on DB2 v9.5 "To test the effects of the patch, an OLTP workload was run on an IBM x3850 M2 server with 2 processors (quad-core Intel Xeon processors at 2.93 GHz) using IBM DB2 v9.5 running Linux 2.6.24rc7 kernel. Comparing runs with and without the patch resulted in an overall performance benefit of ~9.8%. Correspondingly, oprofiles showed that samples from __up_read and __down_read routines that is seen during thread contention for system resources was reduced from 2.8% down to .05%. Monitoring the /proc/vmstat output from the patched run showed that the counter for fast_gup contained a very high number while the fast_gup_slow value was zero." (fast_gup is the old name for get_user_pages_fast, fast_gup_slow is a counter we had for the number of times the slowpath was invoked). The main reason for the improvement is that DB2 has multiple threads each issuing direct-IO. Direct-IO uses get_user_pages, and thus the threads contend the mmap_sem cacheline, and can also contend on page table locks. 
I would anticipate larger performance gains on larger systems, however I think DB2 uses an adaptive mix of threads and processes, so it could be that thread contention remains pretty constant as machine size increases. In which case, we are stuck with "only" a 10% gain. The downside of using get_user_pages_fast is that if there is not a pte with the correct permissions for the access, we end up falling back to get_user_pages and so the get_user_pages_fast is a bit of extra work. However this should not be the common case in most performance critical code. [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: build fix] [akpm@linux-foundation.org: Kconfig fix] [akpm@linux-foundation.org: Makefile fix/cleanup] [akpm@linux-foundation.org: warning fix] Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Cc: Andy Whitcroft <apw@shadowen.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Andi Kleen <andi@firstfloor.org> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Cc: Badari Pulavarty <pbadari@us.ibm.com> Cc: Zach Brown <zach.brown@oracle.com> Cc: Jens Axboe <jens.axboe@oracle.com> Reviewed-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:06 -07:00
Huang Ying	89081d17f7	kexec jump: save/restore device state This patch implements devices state save/restore before after kexec. This patch together with features in kexec_jump patch can be used for following: - A simple hibernation implementation without ACPI support. You can kexec a hibernating kernel, save the memory image of original system and shutdown the system. When resuming, you restore the memory image of original system via ordinary kexec load then jump back. - Kernel/system debug through making system snapshot. You can make system snapshot, jump back, do some thing and make another system snapshot. - Cooperative multi-kernel/system. With kexec jump, you can switch between several kernels/systems quickly without boot process except the first time. This appears like swap a whole kernel/system out/in. - A general method to call program in physical mode (paging turning off). This can be used to invoke BIOS code under Linux. The following user-space tools can be used with kexec jump: - kexec-tools needs to be patched to support kexec jump. The patches and the precompiled kexec can be download from the following URL: source: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec-tools-src_git_kh10.tar.bz2 patches: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec-tools-patches_git_kh10.tar.bz2 binary: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec_git_kh10 - makedumpfile with patches are used as memory image saving tool, it can exclude free pages from original kernel memory image file. 
The patches and the precompiled makedumpfile can be downloaded from the following URL: source: http://khibernation.sourceforge.net/download/release_v10/makedumpfile/makedumpfile-src_cvs_kh10.tar.bz2 patches: http://khibernation.sourceforge.net/download/release_v10/makedumpfile/makedumpfile-patches_cvs_kh10.tar.bz2 binary: http://khibernation.sourceforge.net/download/release_v10/makedumpfile/makedumpfile_cvs_kh10 - An initramfs image can be used as the root file system of kexeced kernel. An initramfs image built with "BuildRoot" can be downloaded from the following URL: initramfs image: http://khibernation.sourceforge.net/download/release_v10/initramfs/rootfs_cvs_kh10.gz All user space tools above are included in the initramfs image. Usage example of simple hibernation: 1. Compile and install patched kernel with following options selected: CONFIG_X86_32=y CONFIG_RELOCATABLE=y CONFIG_KEXEC=y CONFIG_CRASH_DUMP=y CONFIG_PM=y CONFIG_HIBERNATION=y CONFIG_KEXEC_JUMP=y 2. Build an initramfs image containing kexec-tool and makedumpfile, or download the pre-built initramfs image, called rootfs.gz in following text. 3. Prepare a partition to save memory image of original kernel, called hibernating partition in following text. 4. Boot kernel compiled in step 1 (kernel A). 5. In the kernel A, load kernel compiled in step 1 (kernel B) with /sbin/kexec. The shell command line can be as follows: /sbin/kexec --load-preserve-context /boot/bzImage --mem-min=0x100000 --mem-max=0xffffff --initrd=rootfs.gz 6. Boot the kernel B with following shell command line: /sbin/kexec -e 7. The kernel B will boot as normal kexec. In kernel B the memory image of kernel A can be saved into hibernating partition as follows: jump_back_entry=`cat /proc/cmdline | tr ' ' '\n' | grep kexec_jump_back_entry | cut -d '=' -f 2` echo $jump_back_entry > kexec_jump_back_entry cp /proc/vmcore dump.elf Then you can shutdown the machine as normal. 8. Boot kernel compiled in step 1 (kernel C). 
Use the rootfs.gz as root file system. 9. In kernel C, load the memory image of kernel A as follow: /sbin/kexec -l --args-none --entry=`cat kexec_jump_back_entry` dump.elf 10. Jump back to the kernel A as follow: /sbin/kexec -e Then, kernel A is resumed. Implementation point: To support jumping between two kernels, before jumping to (executing) the new kernel and jumping back to the original kernel, the devices are put into quiescent state, and the state of devices and CPU is saved. After jumping back from kexeced kernel and jumping to the new kernel, the state of devices and CPU are restored accordingly. The devices/CPU state save/restore code of software suspend is called to implement corresponding function. Known issues: - Because the segment number supported by sys_kexec_load is limited, hibernation image with many segments may not be load. This is planned to be eliminated by adding a new flag to sys_kexec_load to make a image can be loaded with multiple sys_kexec_load invoking. Now, only the i386 architecture is supported. Signed-off-by: Huang Ying <ying.huang@intel.com> Acked-by: Vivek Goyal <vgoyal@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Nigel Cunningham <nigel@nigel.suspend2.net> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:04 -07:00
Huang Ying	3ab8352137	kexec jump This patch provides an enhancement to kexec/kdump. It implements the following features: - Backup/restore memory used by the original kernel before/after kexec. - Save/restore CPU state before/after kexec. The features of this patch can be used as a general method to call program in physical mode (paging turning off). This can be used to call BIOS code under Linux. kexec-tools needs to be patched to support kexec jump. The patches and the precompiled kexec can be download from the following URL: source: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec-tools-src_git_kh10.tar.bz2 patches: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec-tools-patches_git_kh10.tar.bz2 binary: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec_git_kh10 Usage example of calling some physical mode code and return: 1. Compile and install patched kernel with following options selected: CONFIG_X86_32=y CONFIG_KEXEC=y CONFIG_PM=y CONFIG_KEXEC_JUMP=y 2. Build patched kexec-tool or download the pre-built one. 3. Build some physical mode executable named such as "phy_mode" 4. Boot kernel compiled in step 1. 5. Load physical mode executable with /sbin/kexec. The shell command line can be as follow: /sbin/kexec --load-preserve-context --args-none phy_mode 6. Call physical mode executable with following shell command line: /sbin/kexec -e Implementation point: To support jumping without reserving memory. One shadow backup page (source page) is allocated for each page used by kexeced code image (destination page). When do kexec_load, the image of kexeced code is loaded into source pages, and before executing, the destination pages and the source pages are swapped, so the contents of destination pages are backupped. Before jumping to the kexeced code image and after jumping back to the original kernel, the destination pages and the source pages are swapped too. 
C ABI (calling convention) is used as communication protocol between kernel and called code. A flag named KEXEC_PRESERVE_CONTEXT for sys_kexec_load is added to indicate that the loaded kernel image is used for jumping back. Now, only the i386 architecture is supported. Signed-off-by: Huang Ying <ying.huang@intel.com> Acked-by: Vivek Goyal <vgoyal@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Nigel Cunningham <nigel@nigel.suspend2.net> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:04 -07:00
Alexis Bruemmer	1956a96de4	x86 calgary: fix handling of devices that aren't behind the Calgary The calgary code can give drivers addresses above 4GB which is very bad for hardware that is only 32bit DMA addressable. With this patch, the calgary code sets the global dma_ops to swiotlb or nommu properly, and the dma_ops of devices behind the Calgary/CalIOC2 to calgary_dma_ops. So the calgary code can handle devices safely that aren't behind the Calgary/CalIOC2. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Alexis Bruemmer <alexisb@us.ibm.com> Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Muli Ben-Yehuda <muli@il.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:03 -07:00
FUJITA Tomonori	8d8bb39b9e	dma-mapping: add the device argument to dma_mapping_error() Add per-device dma_mapping_ops support for CONFIG_X86_64 as POWER architecture does: This enables us to cleanly fix the Calgary IOMMU issue that some devices are not behind the IOMMU (http://lkml.org/lkml/2008/5/8/423). I think that per-device dma_mapping_ops support would be also helpful for KVM people to support PCI passthrough but Andi thinks that this makes it difficult to support the PCI passthrough (see the above thread). So I CC'ed this to KVM camp. Comments are appreciated. A pointer to dma_mapping_ops to struct dev_archdata is added. If the pointer is non NULL, DMA operations in asm/dma-mapping.h use it. If it's NULL, the system-wide dma_ops pointer is used as before. If it's useful for KVM people, I plan to implement a mechanism to register a hook called when a new pci (or dma capable) device is created (it works with hot plugging). It enables IOMMUs to set up an appropriate dma_mapping_ops per device. The major obstacle is that dma_mapping_error doesn't take a pointer to the device unlike other DMA operations. So x86 can't have dma_mapping_ops per device. Note all the POWER IOMMUs use the same dma_mapping_error function so this is not a problem for POWER but x86 IOMMUs use different dma_mapping_error functions. The first patch adds the device argument to dma_mapping_error. The patch is trivial but large since it touches lots of drivers and dma-mapping.h in all the architecture. This patch: dma_mapping_error() doesn't take a pointer to the device unlike other DMA operations. So we can't have dma_mapping_ops per device. Note that POWER already has dma_mapping_ops per device but all the POWER IOMMUs use the same dma_mapping_error function. x86 IOMMUs use device argument. 
[akpm@linux-foundation.org: fix sge] [akpm@linux-foundation.org: fix svc_rdma] [akpm@linux-foundation.org: build fix] [akpm@linux-foundation.org: fix bnx2x] [akpm@linux-foundation.org: fix s2io] [akpm@linux-foundation.org: fix pasemi_mac] [akpm@linux-foundation.org: fix sdhci] [akpm@linux-foundation.org: build fix] [akpm@linux-foundation.org: fix sparc] [akpm@linux-foundation.org: fix ibmvscsi] Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Muli Ben-Yehuda <muli@il.ibm.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:03 -07:00
Ingo Molnar	3964cd3a67	x86: visws_quirks, fix build error fix: arch/x86/kernel/visws_quirks.c: In function ‘visws_early_detect’: arch/x86/kernel/visws_quirks.c:290: error: ‘skip_ioapic_setup’ undeclared (first use in this function) arch/x86/kernel/visws_quirks.c:290: error: (Each undeclared identifier is reported only once arch/x86/kernel/visws_quirks.c:290: error: for each function it appears in.) Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-26 19:33:27 +02:00
Mike Travis	0bc3cc03fa	cpumask: change cpumask_of_cpu_ptr to use new cpumask_of_cpu * Replace previous instances of the cpumask_of_cpu_ptr* macros with the new (lvalue capable) generic cpumask_of_cpu(). Signed-off-by: Mike Travis <travis@sgi.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jack Steiner <steiner@sgi.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-26 16:40:33 +02:00
Mike Travis	6524d938b3	cpumask: put cpumask_of_cpu_map in the initdata section * Create the cpumask_of_cpu_map statically in the init data section using NR_CPUS but replace it during boot up with one sized by nr_cpu_ids (num possible cpus). Signed-off-by: Mike Travis <travis@sgi.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jack Steiner <steiner@sgi.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-26 16:40:33 +02:00
Suresh Siddha	6ffac1e90a	x64, fpu: fix possible FPU leakage in error conditions On Thu, Jul 24, 2008 at 03:43:44PM -0700, Linus Torvalds wrote: > So how about this patch as a starting point? This is the RightThing(tm) to > do regardless, and if it then makes it easier to do some other cleanups, > we should do it first. What do you think? restore_fpu_checking() calls init_fpu() in error conditions. While this is wrong (as our main intention is to clear the fpu state of the thread), this was benign before commit `92d140e21f` ("x86: fix taking DNA during 64bit sigreturn"). Post commit `92d140e21f`, live FPU registers may not belong to this process at this error scenario. In the error condition for restore_fpu_checking() (especially during the 64bit signal return), we are doing init_fpu(), which saves the live FPU register state (possibly belonging to some other process context) into the thread struct (through unlazy_fpu() in init_fpu()). This is wrong and can leak the FPU data. For the signal handler restore error condition in restore_i387(), clear the fpu state present in the thread struct (before ultimately sending a SIGSEGV for badframe). For the paranoid error condition check in math_state_restore(), send a SIGSEGV, if we fail to restore the state. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: <stable@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-26 16:37:04 +02:00
Huang Weiyi	39eacc20f9	arch/x86/kernel/visws_quirks.c: Removed duplicated #include Removed duplicated #include in arch/x86/kernel/visws_quirks.c. asm/apic.h asm/arch_hooks.h asm/io.h asm/visws/cobalt.h asm/visws/lithium.h asm/visws/piix4.h linux/init.h linux/interrupt.h linux/smp.h Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-26 16:33:45 +02:00
Yinghai Lu	edb181ac4b	x86: mach-numaq to numaq Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-26 16:31:35 +02:00
Yinghai Lu	e8c48efdb9	x86: mach_summit to summit Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-26 16:31:35 +02:00
Yinghai Lu	c7e7964c98	x86: mach_es7000 to es7000 Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-26 16:31:34 +02:00
Yinghai Lu	1176fa9192	x86: mach-bigsmp to bigsmp Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-26 16:31:34 +02:00
Yinghai Lu	a4dbc34d18	x86: add setup_ioapic_ids for numaq in x86_quirks Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-26 16:31:33 +02:00
Ingo Molnar	10d3285d0b	Merge branch 'x86/urgent' into x86/core Conflicts: include/asm-x86/gpio.h Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-26 16:30:19 +02:00
Ingo Molnar	6dec3a10a7	Merge branch 'x86/x2apic' into x86/core Conflicts: include/asm-x86/i8259.h include/asm-x86/msidef.h Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-26 16:29:23 +02:00
Joerg Roedel	3a61ec387c	x86, AMD IOMMU: include amd_iommu_last_bdf in device initialization All the values read while searching for amd_iommu_last_bdf are defined as inclusive. Let the code handle this value as such. Found by Wei Wang. Thanks Wei. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Cc: iommu@lists.linux-foundation.org Cc: bhavna.sarathy@amd.com Cc: robert.richter@amd.com Cc: Wei Wang <wei.wang2@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-26 15:45:57 +02:00
Joerg Roedel	87e39ea571	x86 gart: replace to_pages macro with iommu_num_pages This patch removes the to_pages macro from x86 GART code and calls the generic iommu_num_pages function instead. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Cc: iommu@lists.linux-foundation.org Cc: bhavna.sarathy@amd.com Cc: robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-26 15:43:06 +02:00
Joerg Roedel	a8132e5fe2	x86, AMD IOMMU: replace to_pages macro with iommu_num_pages This patch removes the to_pages macro from AMD IOMMU code and calls the generic iommu_num_pages function instead. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Cc: iommu@lists.linux-foundation.org Cc: bhavna.sarathy@amd.com Cc: robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-26 15:43:05 +02:00
Joerg Roedel	17f3ab748e	x86: convert discontig_32.c from round_up to roundup Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-26 15:39:22 +02:00
Joerg Roedel	be3e89ee6d	x86: convert numa_64.c from round_up to roundup Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-26 15:39:21 +02:00
Joerg Roedel	d86bb0dac7	x86: convert init_64.c from round_up to roundup Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-26 15:39:21 +02:00
Joerg Roedel	15ae2d76ce	x86: convert pageattr.c from round_up to roundup Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-26 15:39:20 +02:00
Joerg Roedel	1ddb551805	x86: convert pci-dma.c from round_up to roundup Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-26 15:39:20 +02:00

3841 commits