linux

q3k/linux

Author	SHA1	Message	Date
Suresh Siddha	68a8ca593f	x86: fix broken irq migration logic while cleaning up multiple vectors Impact: fix spurious IRQs During irq migration, we send a low priority interrupt to the previous irq destination. This happens in non interrupt-remapping case after interrupt starts arriving at new destination and in interrupt-remapping case after modifying and flushing the interrupt-remapping table entry caches. This low priority irq cleanup handler can cleanup multiple vectors, as multiple irq's can be migrated at almost the same time. While there will be multiple invocations of irq cleanup handler (one cleanup IPI for each irq migration), first invocation of the cleanup handler can potentially cleanup more than one vector (as the first invocation can see the requests for more than vector cleanup). When we cleanup multiple vectors during the first invocation of the smp_irq_move_cleanup_interrupt(), other vectors that are to be cleanedup can still be pending in the local cpu's IRR (as smp_irq_move_cleanup_interrupt() runs with interrupts disabled). When we are ready to unhook a vector corresponding to an irq, check if that vector is registered in the local cpu's IRR. If so skip that cleanup and do a self IPI with the cleanup vector, so that we give a chance to service the pending vector interrupt and then cleanup that vector allocation once we execute the lowest priority handler. This fixes spurious interrupts seen when migrating multiple vectors at the same time. [ This is apparently possible even on conventional xapic, although to the best of our knowledge it has never been seen. The stable maintainers may wish to consider this one for -stable. ] Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: stable@kernel.org	2009-03-17 16:49:30 -07:00
Suresh Siddha	05c3dc2c4b	x86, ioapic: Fix non atomic allocation with interrupts disabled Impact: fix possible race save_mask_IO_APIC_setup() was using non atomic memory allocation while getting called with interrupts disabled. Fix this by splitting this into two different function. Allocation part save_IO_APIC_setup() now happens before disabling interrupts. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2009-03-17 15:45:29 -07:00
Suresh Siddha	29b61be65a	x86, x2apic: cleanup ifdef CONFIG_INTR_REMAP in io_apic code Impact: cleanup Clean up #ifdefs and replace them with helper functions. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2009-03-17 15:45:07 -07:00
Suresh Siddha	0280f7c416	x86, x2apic: cleanup the IO-APIC level migration with interrupt-remapping Impact: simplification In the current code, for level triggered migration, we need to modify the io-apic RTE with the update vector information, along with modifying interrupt remapping table entry(IRTE) with vector and destination. This is to ensure that remote IRR bit inthe IOAPIC RTE gets cleared when the cpu does EOI. With this patch, for level triggered, we eliminate the io-apic RTE modification (with the updated vector information), by using a virtual vector (io-apic pin number). Real vector that is used for interrupting cpu will be coming from the interrupt-remapping table entry. Trigger mode in the IRTE will always be edge, and the actual level or edge trigger will be setup in the IO-APIC RTE. So a level triggered interrupt will appear as an edge to the local apic cpu but still as level to the IO-APIC. With this change, level irq migration can be done by simply modifying the interrupt-remapping table entry with out changing the io-apic RTE. And as the interrupt appears as edge at the cpu, in addition to do the local apic EOI, we need to do IO-APIC directed EOI to clear the remote IRR bit in the IO-APIC RTE. This simplies the irq migration in the presence of interrupt-remapping. Idea-by: Rajesh Sankaran <rajesh.sankaran@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2009-03-17 15:44:27 -07:00
Suresh Siddha	cf6567fe40	x86, x2apic: fix clear_local_APIC() in the presence of x2apic Impact: cleanup, paranoia We were not clearing the local APIC in clear_local_APIC() in the presence of x2apic. Fix it. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2009-03-17 15:43:51 -07:00
Suresh Siddha	7c6d9f9785	x86, x2apic: use virtual wire A mode in disable_IO_APIC() with interrupt-remapping Impact: make kexec work with x2apic disable_IO_APIC() gets called during crashdump aswell, which configures the IO-APIC/LAPIC so that legacy interrupts can be delivered for the kexec'd kernel. In the presence of interrupt-remapping, we need to change the interrupt-remapping configuration aswell as modifying IO-APIC for virtual wire B mode. To keep things simple during the crash, use virtual wire A mode (for which we don't need to touch io-apic and interrupt-remapping tables). Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2009-03-17 15:42:28 -07:00
Suresh Siddha	9d783ba042	x86, x2apic: enable fault handling for intr-remapping Impact: interface augmentation (not yet used) Enable fault handling flow for intr-remapping aswell. Fault handling code now shared by both dma-remapping and intr-remapping. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2009-03-17 15:38:59 -07:00
Ingo Molnar	0ca0f16fd1	Merge branches 'x86/apic', 'x86/asm', 'x86/cleanups', 'x86/debug', 'x86/kconfig', 'x86/mm', 'x86/ptrace', 'x86/setup' and 'x86/urgent'; commit 'v2.6.29-rc8' into x86/core	2009-03-14 16:25:40 +01:00
Ingo Molnar	0f3fa48a7e	x86: cpu/common.c more cleanups Complete/fix the cleanups of cpu/common.c: - fix ugly warning due to asm/topology.h -> linux/topology.h change - standardize the style across the file - simplify/refactor the code flow where possible Cc: Jaswinder Singh Rajput <jaswinder@kernel.org> LKML-Reference: <1237009789.4387.2.camel@localhost.localdomain> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-14 10:37:34 +01:00
Ingo Molnar	c550033ced	Merge branch 'core/percpu' into x86/core	2009-03-14 09:50:10 +01:00
Jaswinder Singh Rajput	88200bc28d	x86: entry_32.S fix compile warnings - fix work mask bit width Fix: arch/x86/kernel/entry_32.S:446: Warning: 00000000080001d1 shortened to 00000000000001d1 arch/x86/kernel/entry_32.S:457: Warning: 000000000800feff shortened to 000000000000feff arch/x86/kernel/entry_32.S:527: Warning: 00000000080001d1 shortened to 00000000000001d1 arch/x86/kernel/entry_32.S:541: Warning: 000000000800feff shortened to 000000000000feff arch/x86/kernel/entry_32.S:676: Warning: 0000000008000091 shortened to 0000000000000091 TIF_SYSCALL_FTRACE is 0x08000000 and until now we checked the first 16 bits of the work mask - bit 27 falls outside of that. Update the entry_32.S code to check the full 32-bit mask. [ %cx => %ecx fix from Cyrill Gorcunov <gorcunov@gmail.com> ] Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: "H. Peter Anvin" <hpa@kernel.org> LKML-Reference: <1237012693.18733.3.camel@ht.satnam> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-14 09:42:51 +01:00
Jaswinder Singh Rajput	9766cdbcb2	x86: cpu/common.c cleanups - fix various style problems - declare varibles before they get used - introduced clear_all_debug_regs - fix header files issues LKML-Reference: <1237009789.4387.2.camel@localhost.localdomain> Signed-off-by: Jaswinder Singh Rajput <jaswinder@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-14 08:59:50 +01:00
Américo Wang	5a8ac9d28d	x86: ptrace, bts: fix an unreachable statement Commit `c2724775ce` put a statement after return, which makes that statement unreachable. Move that statement before return. Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com> Cc: Roland McGrath <roland@redhat.com> Cc: Markus Metzger <markus.t.metzger@intel.com> LKML-Reference: <20090313075622.GB8933@hack> Cc: <stable@kernel.org> # .29 only Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-13 10:27:57 +01:00
Yinghai Lu	773e673de2	x86: fix e820_update_range() Impact: fix left range size on head \| commit `5c0e6f035d` \| x86: fix code paths used by update_mptable \| Impact: fix crashes under Xen due to unrobust e820 code fixes one e820 bug, but introduces another bug. Need to update size for left range at first in case it is header. also add __e820_add_region take more parameter. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: jbeulich@novell.com LKML-Reference: <49B9E286.502@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-13 05:38:29 +01:00
Jaswinder Singh Rajput	91219bcbdc	x86: cpu_debug add write support for MSRs Supported write flag for registers. currently write is enabled only for PMC MSR. [root@ht]# cat /sys/kernel/debug/x86/cpu/cpu1/pmc/0x300/value 0x0 [root@ht]# echo 1234 > /sys/kernel/debug/x86/cpu/cpu1/pmc/0x300/value [root@ht]# cat /sys/kernel/debug/x86/cpu/cpu1/pmc/0x300/value 0x4d2 [root@ht]# echo 0x1234 > /sys/kernel/debug/x86/cpu/cpu1/pmc/0x300/value [root@ht]# cat /sys/kernel/debug/x86/cpu/cpu1/pmc/0x300/value 0x1234 Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-13 03:02:45 +01:00
Jan Beulich	5c0e6f035d	x86: fix code paths used by update_mptable Impact: fix crashes under Xen due to unrobust e820 code find_e820_area_size() must return a properly distinguishable and out-of-bounds value when it fails, and -1UL does not meet that criteria on i386/PAE. Additionally, callers of the function must check against that value. early_reserve_e820() should be prepared for the region found to be outside of the addressable range on 32-bits. e820_update_range_map() should not blindly update e820, but should do all it work on the map it got a pointer passed for (which in 50% of the cases is &e820_saved). It must also not call e820_add_region(), as that again acts on e820 unconditionally. The issues were found when trying to make this option work in our Xen kernel (i.e. where some of the silent assumptions made in the code would not hold). Signed-off-by: Jan Beulich <jbeulich@novell.com> LKML-Reference: <49B9171B.76E4.0078.0@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-13 02:37:19 +01:00
Jan Beulich	82034d6f59	x86: clean up output resulting from update_mptable option Impact: cleanup Without apic=verbose, using the update_mptable option would result in garbled and confusing output due to the inconsistent use of printk() vs apic_printk(). Signed-off-by: Jan Beulich <jbeulich@novell.com> LKML-Reference: <49B914B6.76E4.0078.0@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-13 02:37:19 +01:00
Jan Beulich	9a50156a1c	x86: properly __init-annotate recent early_printk additions Impact: cleanup, save memory Don't keep code resident that's only needed during startup. Signed-off-by: Jan Beulich <jbeulich@novell.com> LKML-Reference: <49B91103.76E4.0078.0@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-13 02:37:18 +01:00
Jan Beulich	13c6c53282	x86, 32-bit: also use cpuinfo_x86's x86_{phys,virt}_bits members Impact: 32/64-bit consolidation In a first step, this allows fixing phys_addr_valid() for PAE (which until now reported all addresses to be valid). Subsequently, this will also allow simplifying some MTRR handling code. Signed-off-by: Jan Beulich <jbeulich@novell.com> LKML-Reference: <49B9101E.76E4.0078.0@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-13 02:37:17 +01:00
Jan Beulich	7a81d9a7da	x86: smarten /proc/interrupts output Impact: change /proc/interrupts output ABI With the number of interrupts on large systems growing, assumptions on the width an interrupt number requires when converted to a decimal string turn invalid. Therefore, calculate the maximum number of digits dynamically. Signed-off-by: Jan Beulich <jbeulich@novell.com> LKML-Reference: <49B911EB.76E4.0078.0@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-13 02:36:52 +01:00
Jan Beulich	02dde8b45c	x86: move various CPU initialization objects into .cpuinit.rodata Impact: debuggability and micro-optimization Putting whatever is possible into the (final) .rodata section increases the likelihood of catching memory corruption bugs early, and reduces false cache line sharing. Signed-off-by: Jan Beulich <jbeulich@novell.com> LKML-Reference: <49B90961.76E4.0078.0@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-12 13:13:07 +01:00
Jan Beulich	c2810188c1	x86-64: move save_paranoid into .kprobes.text Impact: mark save_paranoid as non-kprobe-able code This appears to be necessary as the function gets called from kprobes-unsafe exception handling stubs (i.e. which themselves live in .kprobes.text). Signed-off-by: Jan Beulich <jbeulich@novell.com> LKML-Reference: <49B8F44F.76E4.0078.0@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-12 11:57:46 +01:00
Jan Beulich	9fa7266c2f	x86: remove leftover unwind annotations Impact: cleanup These got left in needlessly when ret_from_fork got simplified. Signed-off-by: Jan Beulich <jbeulich@novell.com> LKML-Reference: <49B8F355.76E4.0078.0@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-12 11:50:39 +01:00
Ingo Molnar	a98fe7f342	Merge branches 'x86/asm', 'x86/debug', 'x86/mm', 'x86/setup', 'x86/urgent' and 'linus' into x86/core	2009-03-12 11:50:15 +01:00
Jaswinder Singh Rajput	8229d75438	x86: cpu architecture debug code, build fix, cleanup move store_ldt outside the CONFIG_PARAVIRT section and also clean up the code a bit. Signed-off-by: Jaswinder Singh Rajput <jaswinder@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-11 14:52:03 +01:00
Ingo Molnar	78b020d035	Merge branches 'x86/cleanups', 'x86/kexec', 'x86/mce2' and 'linus' into x86/core	2009-03-11 10:49:15 +01:00
Ingo Molnar	65a37b29a8	Merge branch 'tj-percpu' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc into core/percpu	2009-03-11 10:30:23 +01:00
Ingo Molnar	1d8ce7bc4d	Merge branch 'linus' into core/percpu Conflicts: arch/x86/include/asm/fixmap_64.h	2009-03-11 10:29:28 +01:00
Thomas Gleixner	bf5172d07a	x86: convert obsolete irq_desc_t typedef to struct irq_desc Impact: cleanup Convert the last remaining users. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2009-03-11 09:49:01 +01:00
KOSAKI Motohiro	5490fa9673	x86, mce: use round_jiffies() instead round_jiffies_relative() Impact: saving power _very_ little round_jiffies() round up absolute jiffies to full second. round_jiffies_relative() round up relative jiffies to full second. The "t->expires" is absolute jiffies. Then, round_jiffies() should be used instead round_jiffies_relative(). Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-03-10 22:33:06 -07:00
Huang Ying	fee7b0d84c	x86, kexec: x86_64: add kexec jump support for x86_64 Impact: New major feature This patch add kexec jump support for x86_64. More information about kexec jump can be found in corresponding x86_32 support patch. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-03-10 18:13:25 -07:00
Huang Ying	5359454701	x86, kexec: x86_64: add identity map for pages at image->start Impact: Fix corner case that cannot yet occur image->start may be outside of 0 ~ max_pfn, for example when jumping back to original kernel from kexeced kenrel. This patch add identity map for pages at image->start. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-03-10 18:13:25 -07:00
Huang Ying	fef3a7a174	x86, kexec: fix kexec x86 coding style Impact: Cleanup Fix some coding style issue for kexec x86. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-03-10 18:13:25 -07:00
Jaswinder Singh Rajput	9b779edf4b	x86: cpu architecture debug code Introduce: cat /sys/kernel/debug/x86/cpu/* for Intel and AMD processors to view / debug the state of each CPU. By using this we can debug whole range of registers and other cpu information for debugging purpose and monitor how things are changing. This can be useful for developers as well as for users. Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com> LKML-Reference: <1236701373.3387.4.camel@localhost.localdomain> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-10 18:39:45 +01:00
Stoyan Gaydarov	8c5dfd2551	x86: BUG to BUG_ON changes Impact: cleanup Signed-off-by: Stoyan Gaydarov <stoyboyker@gmail.com> LKML-Reference: <1236661850-8237-8-git-send-email-stoyboyker@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-10 09:55:18 +01:00
Ingo Molnar	467c88fee5	Merge branches 'x86/apic', 'x86/asm', 'x86/fixmap', 'x86/memtest', 'x86/mm', 'x86/urgent', 'linus' and 'core/percpu' into x86/core	2009-03-10 09:26:38 +01:00
Tejun Heo	66c3a75772	percpu: generalize embedding first chunk setup helper Impact: code reorganization Separate out embedding first chunk setup helper from x86 embedding first chunk allocator and put it in mm/percpu.c. This will be used by the default percpu first chunk allocator and possibly by other archs. Signed-off-by: Tejun Heo <tj@kernel.org>	2009-03-10 16:27:48 +09:00
Tejun Heo	6074d5b0a3	percpu: more flexibility for @dyn_size of pcpu_setup_first_chunk() Impact: cleanup, more flexibility for first chunk init Non-negative @dyn_size used to be allowed iff @unit_size wasn't auto. This restriction stemmed from implementation detail and made things a bit less intuitive. This patch allows @dyn_size to be specified regardless of @unit_size and swaps the positions of @dyn_size and @unit_size so that the parameter order makes more sense (static, reserved and dyn sizes followed by enclosing unit_size). While at it, add @unit_size >= PCPU_MIN_UNIT_SIZE sanity check. Signed-off-by: Tejun Heo <tj@kernel.org>	2009-03-10 16:27:48 +09:00
Linus Torvalds	99adcd9d67	Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq: [CPUFREQ] Add p4-clockmod sysfs-ui removal to feature-removal schedule. Revert "[CPUFREQ] Disable sysfs ui for p4-clockmod."	2009-03-09 13:23:59 -07:00
Dave Jones	129f8ae9b1	Revert "[CPUFREQ] Disable sysfs ui for p4-clockmod." This reverts commit `e088e4c9cd`. Removing the sysfs interface for p4-clockmod was flagged as a regression in bug 12826. Course of action: - Find out the remaining causes of overheating, and fix them if possible. ACPI should be doing the right thing automatically. If it isn't, we need to fix that. - mark p4-clockmod ui as deprecated - try again with the removal in six months. It's not really feasible to printk about the deprecation, because it needs to happen at all the sysfs entry points, which means adding a lot of strcmp("p4-clockmod".. calls to the core, which.. bleuch. Signed-off-by: Dave Jones <davej@redhat.com>	2009-03-09 15:07:33 -04:00
Yinghai Lu	1f442d70c8	x86: remove smp_apply_quirks()/smp_checks() Impact: cleanup and code size reduction on 64-bit This code is only applied to Intel Pentium and AMD K7 32-bit cpus. Move those checks to intel_init()/amd_init() for 32-bit so 64-bit will not build this code. Also change to use cpu_index check to see if we need to emit warning. Signed-off-by: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <49B377D2.8030108@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-08 16:22:56 +01:00
Cliff Wickman	3a450de136	x86: UV: remove uv_flush_tlb_others() WARN_ON In uv_flush_tlb_others() (arch/x86/kernel/tlb_uv.c), the "WARN_ON(!in_atomic())" fails if CONFIG_PREEMPT is not enabled. And CONFIG_PREEMPT is not enabled by default in the distribution that most UV owners will use. We could #ifdef CONFIG_PREEMPT the warning, but that is not good form. And there seems to be no suitable fix to in_atomic() when CONFIG_PREMPT is not on. As Ingo commented: > and we have no proper primitive to test for atomicity. (mainly > because we dont know about atomicity on a non-preempt kernel) So we drop the WARN_ON. Signed-off-by: Cliff Wickman <cpw@sgi.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-08 11:17:15 +01:00
Markus Metzger	73bf1b62f5	x86, pebs: correct qualifier passed to ds_write_config() from ds_request_pebs() ds_write_config() can write the BTS as well as the PEBS part of the DS config. ds_request_pebs() passes the wrong qualifier, which results in the wrong configuration to be written. Reported-by: Stephane Eranian <eranian@googlemail.com> Signed-off-by: Markus Metzger <markus.t.metzger@intel.com> LKML-Reference: <20090305085721.A22550@sedona.ch.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-06 16:13:15 +01:00
Markus Metzger	9ca0791dca	x86, bts: remove bad warning In case a ptraced task is reaped (while the tracer is still attached), ds_exit_thread() is called before ptrace_exit(). The latter will release the bts_tracer and remove the thread's ds_ctx. The former will WARN() if the context is not NULL. Oleg Nesterov submitted patches that move ptrace_exit() before exit_thread() and thus reverse the order of the above calls. Remove the bad warning. I will add it again when Oleg's changes are in. Signed-off-by: Markus Metzger <markus.t.metzger@intel.com> LKML-Reference: <20090305084954.A22000@sedona.ch.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-06 16:13:15 +01:00
Tejun Heo	6b19b0c240	x86, percpu: setup reserved percpu area for x86_64 Impact: fix relocation overflow during module load x86_64 uses 32bit relocations for symbol access and static percpu symbols whether in core or modules must be inside 2GB of the percpu segement base which the dynamic percpu allocator doesn't guarantee. This patch makes x86_64 reserve PERCPU_MODULE_RESERVE bytes in the first chunk so that module percpu areas are always allocated from the first chunk which is always inside the relocatable range. This problem exists for any percpu allocator but is easily triggered when using the embedding allocator because the second chunk is located beyond 2GB on it. This patch also changes the meaning of PERCPU_DYNAMIC_RESERVE such that it only indicates the size of the area to reserve for dynamic allocation as static and dynamic areas can be separate. New PERCPU_DYNAMIC_RESERVED is increased by 4k for both 32 and 64bits as the reserved area separation eats away some allocatable space and having slightly more headroom (currently between 4 and 8k after minimal boot sans module area) makes sense for common case performance. x86_32 can address anywhere from anywhere and doesn't need reserving. Mike Galbraith first reported the problem first and bisected it to the embedding percpu allocator commit. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Mike Galbraith <efault@gmx.de> Reported-by: Jaswinder Singh Rajput <jaswinder@kernel.org>	2009-03-06 14:33:59 +09:00
Tejun Heo	edcb463997	percpu, module: implement reserved allocation and use it for module percpu variables Impact: add reserved allocation functionality and use it for module percpu variables This patch implements reserved allocation from the first chunk. When setting up the first chunk, arch can ask to set aside certain number of bytes right after the core static area which is available only through a separate reserved allocator. This will be used primarily for module static percpu variables on architectures with limited relocation range to ensure that the module perpcu symbols are inside the relocatable range. If reserved area is requested, the first chunk becomes reserved and isn't available for regular allocation. If the first chunk also includes piggy-back dynamic allocation area, a separate chunk mapping the same region is created to serve dynamic allocation. The first one is called static first chunk and the second dynamic first chunk. Although they share the page map, their different area map initializations guarantee they serve disjoint areas according to their purposes. If arch doesn't setup reserved area, reserved allocation is handled like any other allocation. Signed-off-by: Tejun Heo <tj@kernel.org>	2009-03-06 14:33:59 +09:00
Tejun Heo	9a4f8a878b	x86: make embedding percpu allocator return excessive free space Impact: reduce unnecessary memory usage on certain configurations Embedding percpu allocator allocates unit_size * smp_num_possible_cpus() bytes consecutively and use it for the first chunk. However, if the static area is small, this can result in excessive prellocated free space in the first chunk due to PCPU_MIN_UNIT_SIZE restriction. This patch makes embedding percpu allocator preallocate only what's necessary as described by PERPCU_DYNAMIC_RESERVE and return the leftover to the bootmem allocator. Signed-off-by: Tejun Heo <tj@kernel.org>	2009-03-06 14:33:59 +09:00
Tejun Heo	cafe8816b2	percpu: use negative for auto for pcpu_setup_first_chunk() arguments Impact: argument semantic cleanup In pcpu_setup_first_chunk(), zero @unit_size and @dyn_size meant auto-sizing. It's okay for @unit_size as 0 doesn't make sense but 0 dynamic reserve size is valid. Alos, if arch @dyn_size is calculated from other parameters, it might end up passing in 0 @dyn_size and malfunction when the size is automatically adjusted. This patch makes both @unit_size and @dyn_size ssize_t and use -1 for auto sizing. Signed-off-by: Tejun Heo <tj@kernel.org>	2009-03-06 14:33:59 +09:00
Ingo Molnar	31bbed527e	Merge branch 'x86/uv' into x86/core	2009-03-05 21:49:47 +01:00
Ingo Molnar	28e93a005b	Merge branch 'x86/mm' into x86/core	2009-03-05 21:49:35 +01:00

1 2 3 4 5 ...

4060 commits