Commit Graph

401 Commits (6b34350f490b2c8508717541ec1fd2bbaadded94)

Author SHA1 Message Date
Andi Kleen 50895c5d76 [PATCH] x86_64: Fix gcc 4 warning in aperture.c
Fix

  arch/x86_64/kernel/aperture.c: In function #iommu_hole_init#:
  arch/x86_64/kernel/aperture.c:199: warning: #aper_order# may be used uninitialized in this function

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 19:55:14 -08:00
Suresh Siddha f5f786d045 [PATCH] x86-64/i386: Fix CPU model for family 6
According to cpuid instruction in IA32 SDM-Vol2, when computing cpu model,
we need to consider extended model ID for family 0x6 also.

AK: Also added fixes/simplifcation from Petr Vandrovec

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 19:55:14 -08:00
Ashok Raj e9b59d834f [PATCH] x86_64: Remove duplicate __cpuinit define
Remove duplicate __cpuinit in smp.c. Already defined in init.h which is
already included.

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 19:55:14 -08:00
Andi Kleen 47492d3667 [PATCH] x86_64: Use the DMA32 zone for dma_alloc_coherent()/pci_alloc_consistent
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 19:55:14 -08:00
James Cleverdon 6004e1b7ef [PATCH] i386/x86-64: Share interrupt vectors when there is a large number of interrupt sources
Here's a patch that builds on Natalie Protasevich's IRQ compression
patch and tries to work for MPS boots as well as ACPI.  It is meant for
a 4-node IBM x460 NUMA box, which was dying because it had interrupt
pins with GSI numbers > NR_IRQS and thus overflowed irq_desc.

The problem is that this system has 270 GSIs (which are 1:1 mapped with
I/O APIC RTEs) and an 8-node box would have 540.  This is much bigger
than NR_IRQS (224 for both i386 and x86_64).  Also, there aren't enough
vectors to go around.  There are about 190 usable vectors, not counting
the reserved ones and the unused vectors at 0x20 to 0x2F.  So, my patch
attempts to compress the GSI range and share vectors by sharing IRQs.

Cc: "Protasevich, Natalie" <Natalie.Protasevich@unisys.com>

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 19:55:13 -08:00
Jacob Shin 89b831ef8b [PATCH] x86_64: Support for AMD specific MCE Threshold.
MC4_MISC - DRAM Errors Threshold Register realized under AMD K8 Rev F.
This register is used to count correctable and uncorrectable ECC errors that occur during DRAM read operations.
The user may interface through sysfs files in order to change the threshold configuration.

bank%d/error_count - reads current error count, write to clear.
bank%d/interrupt_enable - set/clear interrupt enable.
bank%d/threshold_limit - read/write the threshold limit.

APIC vector 0xF9 in hw_irq.h.
5 software defined bank ids in mce.h.
new apic.c function to setup threshold apic lvt.
defaults to interrupt off, count enabled, and threshold limit max.
sysfs interface created on /sys/devices/system/threshold.

AK: added some ifdefs to make it compile on UP

Signed-off-by: Jacob Shin <jacob.shin@amd.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 19:55:13 -08:00
Andi Kleen e18c6874a5 [PATCH] x86_64: Account mem_map in VM holes accounting
The VM needs to know about lost memory in zones to accurately
balance dirty pages. This patch accounts mem_map in there too,
which fixes a constant errror of a few percent. Also some
other misc mappings and the kernel text itself are accounted
too.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 19:55:13 -08:00
Andi Kleen a2f1b42490 [PATCH] x86_64: Add 4GB DMA32 zone
Add a new 4GB GFP_DMA32 zone between the GFP_DMA and GFP_NORMAL zones.

As a bit of historical background: when the x86-64 port
was originally designed we had some discussion if we should
use a 16MB DMA zone like i386 or a 4GB DMA zone like IA64 or
both. Both was ruled out at this point because it was in early
2.4 when VM is still quite shakey and had bad troubles even
dealing with one DMA zone.  We settled on the 16MB DMA zone mainly
because we worried about older soundcards and the floppy.

But this has always caused problems since then because
device drivers had trouble getting enough DMA able memory. These days
the VM works much better and the wide use of NUMA has proven
it can deal with many zones successfully.

So this patch adds both zones.

This helps drivers who need a lot of memory below 4GB because
their hardware is not accessing more (graphic drivers - proprietary
and free ones, video frame buffer drivers, sound drivers etc.).
Previously they could only use IOMMU+16MB GFP_DMA, which
was not enough memory.

Another common problem is that hardware who has full memory
addressing for >4GB misses it for some control structures in memory
(like transmit rings or other metadata).  They tended to allocate memory
in the 16MB GFP_DMA or the IOMMU/swiotlb then using pci_alloc_consistent,
but that can tie up a lot of precious 16MB GFPDMA/IOMMU/swiotlb memory
(even on AMD systems the IOMMU tends to be quite small) especially if you have
many devices.  With the new zone pci_alloc_consistent can just put
this stuff into memory below 4GB which works better.

One argument was still if the zone should be 4GB or 2GB. The main
motivation for 2GB would be an unnamed not so unpopular hardware
raid controller (mostly found in older machines from a particular four letter
company) who has a strange 2GB restriction in firmware. But
that one works ok with swiotlb/IOMMU anyways, so it doesn't really
need GFP_DMA32. I chose 4GB to be compatible with IA64 and because
it seems to be the most common restriction.

The new zone is so far added only for x86-64.

For other architectures who don't set up this
new zone nothing changes. Architectures can set a compatibility
define in Kconfig CONFIG_DMA_IS_DMA32 that will define GFP_DMA32
as GFP_DMA. Otherwise it's a nop because on 32bit architectures
it's normally not needed because GFP_NORMAL (=0) is DMA able
enough.

One problem is still that GFP_DMA means different things on different
architectures. e.g. some drivers used to have #ifdef ia64  use GFP_DMA
(trusting it to be 4GB) #elif __x86_64__ (use other hacks like
the swiotlb because 16MB is not enough) ... . This was quite
ugly and is now obsolete.

These should be now converted to use GFP_DMA32 unconditionally. I haven't done
this yet. Or best only use pci_alloc_consistent/dma_alloc_coherent
which will use GFP_DMA32 transparently.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 19:55:13 -08:00
Andi Kleen 56720367cd [PATCH] x86_64: Update defconfig
Rerun and enable autofs 4, relayfs and softdog

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 19:55:12 -08:00
Karsten Wiese d6c7ac081b [PATCH] x86_64 two timer entries in /sys
attached patch renames one instance of
	/sys/devices/system/timer
to
	/sys/devices/system/timer_pit
to avoid a name clash with another instance created in time.c.

Acked-by: Andi Kleen <ak@muc.de>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-13 18:14:10 -08:00
Nick Piggin 64c7c8f885 [PATCH] sched: resched and cpu_idle rework
Make some changes to the NEED_RESCHED and POLLING_NRFLAG to reduce
confusion, and make their semantics rigid.  Improves efficiency of
resched_task and some cpu_idle routines.

* In resched_task:
- TIF_NEED_RESCHED is only cleared with the task's runqueue lock held,
  and as we hold it during resched_task, then there is no need for an
  atomic test and set there. The only other time this should be set is
  when the task's quantum expires, in the timer interrupt - this is
  protected against because the rq lock is irq-safe.

- If TIF_NEED_RESCHED is set, then we don't need to do anything. It
  won't get unset until the task get's schedule()d off.

- If we are running on the same CPU as the task we resched, then set
  TIF_NEED_RESCHED and no further action is required.

- If we are running on another CPU, and TIF_POLLING_NRFLAG is *not* set
  after TIF_NEED_RESCHED has been set, then we need to send an IPI.

Using these rules, we are able to remove the test and set operation in
resched_task, and make clear the previously vague semantics of
POLLING_NRFLAG.

* In idle routines:
- Enter cpu_idle with preempt disabled. When the need_resched() condition
  becomes true, explicitly call schedule(). This makes things a bit clearer
  (IMO), but haven't updated all architectures yet.

- Many do a test and clear of TIF_NEED_RESCHED for some reason. According
  to the resched_task rules, this isn't needed (and actually breaks the
  assumption that TIF_NEED_RESCHED is only cleared with the runqueue lock
  held). So remove that. Generally one less locked memory op when switching
  to the idle thread.

- Many idle routines clear TIF_POLLING_NRFLAG, and only set it in the inner
  most polling idle loops. The above resched_task semantics allow it to be
  set until before the last time need_resched() is checked before going into
  a halt requiring interrupt wakeup.

  Many idle routines simply never enter such a halt, and so POLLING_NRFLAG
  can be always left set, completely eliminating resched IPIs when rescheduling
  the idle task.

  POLLING_NRFLAG width can be increased, to reduce the chance of resched IPIs.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Con Kolivas <kernel@kolivas.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-09 07:56:33 -08:00
Nick Piggin 5bfb5d690f [PATCH] sched: disable preempt in idle tasks
Run idle threads with preempt disabled.

Also corrected a bugs in arm26's cpu_idle (make it actually call schedule()).
How did it ever work before?

Might fix the CPU hotplugging hang which Nigel Cunningham noted.

We think the bug hits if the idle thread is preempted after checking
need_resched() and before going to sleep, then the CPU offlined.

After calling stop_machine_run, the CPU eventually returns from preemption and
into the idle thread and goes to sleep.  The CPU will continue executing
previous idle and have no chance to call play_dead.

By disabling preemption until we are ready to explicitly schedule, this bug is
fixed and the idle threads generally become more robust.

From: alexs <ashepard@u.washington.edu>

  PPC build fix

From: Yoichi Yuasa <yuasa@hh.iij4u.or.jp>

  MIPS build fix

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Yoichi Yuasa <yuasa@hh.iij4u.or.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-09 07:56:33 -08:00
Christoph Hellwig b05a581d48 [PATCH] move some COMPATIBLE_IOCTL entries from x86_64 to common code
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-09 07:56:01 -08:00
Adrian Bunk 5fed0578be [PATCH] unexport phys_proc_id and cpu_core_id
EXPORT_SYMBOL's for phys_proc_id and cpu_core_id were added this year but
never used.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:54:09 -08:00
Ananth N Mavinakayanahalli d217d5450f [PATCH] Kprobes: preempt_disable/enable() simplification
Reorganize the preempt_disable/enable calls to eliminate the extra preempt
depth.  Changes based on Paul McKenney's review suggestions for the kprobes
RCU changeset.

Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:46 -08:00
Ananth N Mavinakayanahalli 991a51d83a [PATCH] Kprobes: Use RCU for (un)register synchronization - arch changes
Changes to the arch kprobes infrastructure to take advantage of the locking
changes introduced by usage of RCU for synchronization.  All handlers are now
run without any locks held, so they have to be re-entrant or provide their own
synchronization.

Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:46 -08:00
Ananth N Mavinakayanahalli e7a510f92c [PATCH] Kprobes: Track kprobe on a per_cpu basis - x86_64 changes
x86_64 changes to track kprobe execution on a per-cpu basis.  We now track the
kprobe state machine independently on each cpu using a arch specific kprobe
control block.

Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:46 -08:00
Ananth N Mavinakayanahalli 66ff2d0691 [PATCH] Kprobes: rearrange preempt_disable/enable() calls
The following set of patches are aimed at improving kprobes scalability.  We
currently serialize kprobe registration, unregistration and handler execution
using a single spinlock - kprobe_lock.

With these changes, kprobe handlers can run without any locks held.  It also
allows for simultaneous kprobe handler executions on different processors as
we now track kprobe execution on a per processor basis.  It is now necessary
that the handlers be re-entrant since handlers can run concurrently on
multiple processors.

All changes have been tested on i386, ia64, ppc64 and x86_64, while sparc64
has been compile tested only.

The patches can be viewed as 3 logical chunks:

patch 1: 	Reorder preempt_(dis/en)able calls
patches 2-7: 	Introduce per_cpu data areas to track kprobe execution
patches 8-9: 	Use RCU to synchronize kprobe (un)registration and handler
		execution.

Thanks to Maneesh Soni, James Keniston and Anil Keshavamurthy for their
review and suggestions. Thanks again to Anil, Hien Nguyen and Kevin Stafford
for testing the patches.

This patch:

Reorder preempt_disable/enable() calls in arch kprobes files in preparation to
introduce locking changes.  No functional changes introduced by this patch.

Signed-off-by: Ananth N Mavinakayahanalli <ananth@in.ibm.com>
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:45 -08:00
Christoph Hellwig 481bed4542 [PATCH] consolidate sys_ptrace()
The sys_ptrace boilerplate code (everything outside the big switch
statement for the arch-specific requests) is shared by most architectures.
This patch moves it to kernel/ptrace.c and leaves the arch-specific code as
arch_ptrace.

Some architectures have a too different ptrace so we have to exclude them.
They continue to keep their implementations.  For sh64 I had to add a
sh64_ptrace wrapper because it does some initialization on the first call.
For um I removed an ifdefed SUBARCH_PTRACE_SPECIAL block, but
SUBARCH_PTRACE_SPECIAL isn't defined anywhere in the tree.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Paul Mackerras <paulus@samba.org>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Acked-By: David Howells <dhowells@redhat.com>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:42 -08:00
Prasanna S Panchamukhi cd6b0762a0 [PATCH] Move Kprobes and Oprofile to "Instrumentation Support" menu
Andrew Morton suggested to move kprobes from kernel hacking menu, since
kernel hacking menu is in-appropriate for the Kprobes.  This patch moves
Kprobes and Oprofile under instrumentation menu.

(akpm: it's not a natural fit, but things like djprobes and the s390 guys'
statistics library need a home)

Signed-of-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Cc: Philippe Elie <phil.el@wanadoo.fr>
Cc: John Levon <levon@movementarian.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:35 -08:00
Alexandre Oliva 06024f217d [PATCH] x86-64: bitops fix for -Os
This fixes the x86-64 find_[first|next]_zero_bit() function for the
end-of-range case.  It didn't test for a zero size, and the "rep scas"
would do entirely the wrong thing.

Signed-off-by: Alexandre Oliva <oliva@lsd.ic.unicamp.br>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-02 19:41:32 -08:00
Tony Luck c7fb577e2a manual update from upstream:
Applied Al's change 06a544971f
to new location of swiotlb.c

Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-10-31 10:51:57 -08:00
Clemens Ladisch 7811fb8f40 [PATCH] hpet-RTC: cache the comparator register
Reads from an HPET register require a round trip to the south bridge and are
almost as slow as PCI reads.  By caching the last value we've written to the
comparator register, we can eliminate all HPET reads from the fast path in the
emulated RTC interrupt handler.

Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:30 -08:00
Clemens Ladisch 5f819949ee [PATCH] hpet-RTC: fix timer config register accesses
Make sure that the RTC timer is in non-periodic mode; some stupid BIOS might
have initialized it to periodic mode.

Furthermore, don't set the SETVAL bit in the config register.  This wouldn't
have any effect unless the timer was in period mode (which it isn't), and then
the actual timer frequency would be half that of the desired one because
incrementing the comparator in the interrupt handler would be done after the
hardware has already incremented it itself.

Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:29 -08:00
Clemens Ladisch f00c96f313 [PATCH] hpet-RTC: disable interrupt when no longer needed
When the emulated RTC interrupt is no longer needed, we better disable it;
otherwise, we get a spurious interrupt whenever the timer has rolled over and
reaches the same comparator value.

Having a superfluous interrupt every five minutes doesn't hurt much, but it's
bad style anyway.  ;-)

Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Acked-by: "Pallipadi, Venkatesh" <venkatesh.pallipadi@intel.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:29 -08:00
Thomas Gleixner ecea8d19c9 [PATCH] jiffies_64 cleanup
Define jiffies_64 in kernel/timer.c rather than having 24 duplicated
defines in each architecture.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:25 -08:00
Brian Gerst 371e8c25b6 [PATCH] Remove orphaned TIOCGDEV compat ioctl
This ioctl doesn't exist for native i386.

Signed-off-by: Brian Gerst <bgerst@didntduck.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:25 -08:00
Oleg Nesterov a8db2db1e6 [PATCH] introduce setup_timer() helper
Every user of init_timer() also needs to initialize ->function and ->data
fields.  This patch adds a simple setup_timer() helper for that.

The schedule_timeout() is patched as an example of usage.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:17 -08:00
Rafael J. Wysocki 2c1b4a5ca4 [PATCH] swsusp: rework memory freeing on resume
The following patch makes swsusp use the PG_nosave and PG_nosave_free flags to
mark pages that should be freed in case of an error during resume.

This allows us to simplify the code and to use swsusp_free() in all of the
swsusp's resume error paths, which makes them actually work.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:14 -08:00
Brian Gerst c531178157 [PATCH] Clean up mtrr compat ioctl code
Handle 32-bit mtrr ioctls in the mtrr driver instead of the ia32
compatability layer.

Signed-off-by: Brian Gerst <bgerst@didntduck.org>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:13 -08:00
Kamble, Nitin A daedb82d6b [PATCH] x86: vmx cpu feature detection
If VMX feature is available in the CPU, this patch will make it visible in
the /proc/cpuinfo with the cpuid detection.

Signed-Off-By: Nitin A Kamble <nitin.a.kamble@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:13 -08:00
Shaohua Li 08967f941a [PATCH] FPU context corrupted after resume
mxcsr_feature_mask_init isn't needed in suspend/resume time (we can use
boot time mask).  And actually it's harmful, as it clear task's saved
fxsave in resume.  This bug is widely seen by users using zsh.

(akpm: my eyes.  Fixed some surrounding whitespace mess)

Signed-off-by: Shaohua Li<shaohua.li@intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:11 -08:00
Mathieu Desnoyers dacb16b1a0 [PATCH] i386 and x86_64 TSC set_cyc2ns_scale imprecision
I just found out that some precision is unnecessarily lost in the
arch/i386/kernel/timers/timer_tsc.c:set_cyc2ns_scale function.  It uses a
cpu_mhz parameter when it could use a cpu_khz.  In the specific case of an
Intel P4 running at 3001.171 Mhz, the truncation to 3001 Mhz leads to an
imprecision of 19 microseconds per second : this is very sad for a timer with
nearly nanosecond accuracy.

Fix the x86_64 architecture too.

Cc: george anzinger <george@mvista.com>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:11 -08:00
Hugh Dickins 872fec16d9 [PATCH] mm: init_mm without ptlock
First step in pushing down the page_table_lock.  init_mm.page_table_lock has
been used throughout the architectures (usually for ioremap): not to serialize
kernel address space allocation (that's usually vmlist_lock), but because
pud_alloc,pmd_alloc,pte_alloc_kernel expect caller holds it.

Reverse that: don't lock or unlock init_mm.page_table_lock in any of the
architectures; instead rely on pud_alloc,pmd_alloc,pte_alloc_kernel to take
and drop it when allocating a new one, to check lest a racing task already
did.  Similarly no page_table_lock in vmalloc's map_vm_area.

Some temporary ugliness in __pud_alloc and __pmd_alloc: since they also handle
user mms, which are converted only by a later patch, for now they have to lock
differently according to whether or not it's init_mm.

If sources get muddled, there's a danger that an arch source taking
init_mm.page_table_lock will be mixed with common source also taking it (or
neither take it).  So break the rules and make another change, which should
break the build for such a mismatch: remove the redundant mm arg from
pte_alloc_kernel (ppc64 scrapped its distinct ioremap_mm in 2.6.13).

Exceptions: arm26 used pte_alloc_kernel on user mm, now pte_alloc_map; ia64
used pte_alloc_map on init_mm, now pte_alloc_kernel; parisc had bad args to
pmd_alloc and pte_alloc_kernel in unused USE_HPPA_IOREMAP code; ppc64
map_io_page forgot to unlock on failure; ppc mmu_mapin_ram and ppc64 im_free
took page_table_lock for no good reason.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:40 -07:00
Hugh Dickins 404351e67a [PATCH] mm: mm_init set_mm_counters
How is anon_rss initialized?  In dup_mmap, and by mm_alloc's memset; but
that's not so good if an mm_counter_t is a special type.  And how is rss
initialized?  By set_mm_counter, all over the place.  Come on, we just need to
initialize them both at once by set_mm_counter in mm_init (which follows the
memcpy when forking).

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:38 -07:00
Al Viro f80aabb03a [PATCH] gfp_t: dma-mapping (amd64)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-28 08:16:48 -07:00
Tony Luck 9cec58dc13 Update from upstream with manual merge of Yasunori Goto's
changes to swiotlb.c made in commit 281dd25cdc
since this file has been moved from arch/ia64/lib/swiotlb.c to
lib/swiotlb.c

Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-10-20 10:41:44 -07:00
Andi Kleen 421c7ce6d0 [PATCH] x86_64: Allocate cpu local data for all possible CPUs
CPU hotplug fills up the possible map to NR_CPUs, but it did that after
setting up per CPU data. This lead to CPU data not getting allocated
for all possible CPUs, which lead to various side effects.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-10 16:33:25 -07:00
Andi Kleen 094804c5a1 [PATCH] x86_64: Fix change_page_attr cache flushing
Noticed by Terence Ripperda

Undo wrong change in global_flush_tlb. We need to flush the caches in all
cases, not just when pages were reverted. This was a bogus optimization
added earlier, but it was wrong.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-10 16:10:33 -07:00
Markus F.X.J. Oberhumer d347f37227 [PATCH] i386: fix stack alignment for signal handlers
This fixes the setup of the alignment of the signal frame, so that all
signal handlers are run with a properly aligned stack frame.

The current code "over-aligns" the stack pointer so that the stack frame
is effectively always mis-aligned by 4 bytes.  But what we really want
is that on function entry ((sp + 4) & 15) == 0, which matches what would
happen if the stack were aligned before a "call" instruction.

Signed-off-by: Markus F.X.J. Oberhumer <markus@oberhumer.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-10 08:45:06 -07:00
Rafael J. Wysocki 3dd083255d [PATCH] x86_64: Set up safe page tables during resume
The following patch makes swsusp avoid the possible temporary corruption
of page translation tables during resume on x86-64.  This is achieved by
creating a copy of the relevant page tables that will not be modified by
swsusp and can be safely used by it on resume.

The problem is that during resume on x86-64 swsusp may temporarily
corrupt the page tables used for the direct mapping of RAM.  If that
happens, a page fault occurs and cannot be handled properly, which leads
to the solid hang of the affected system.  This leads to the loss of the
system's state from before suspend and may result in the loss of data or
the corruption of filesystems, so it is a serious issue.  Also, it
appears to happen quite often (for me, as often as 50% of the time).

The problem is related to the fact that (at least) one of the PMD
entries used in the direct memory mapping (starting at PAGE_OFFSET)
points to a page table the physical address of which is much greater
than the physical address of the PMD entry itself.  Moreover,
unfortunately, the physical address of the page table before suspend
(i.e.  the one stored in the suspend image) happens to be different to
the physical address of the corresponding page table used during resume
(i.e.  the one that is valid right before swsusp_arch_resume() in
arch/x86_64/kernel/suspend_asm.S is executed).  Thus while the image is
restored, the "offending" PMD entry gets overwritten, so it does not
point to the right physical address any more (i.e.  there's no page
table at the address pointed to by it, because it points to the address
the page table has been at during suspend).  Consequently, if the PMD
entry is used later on, and it _is_ used in the process of copying the
image pages, a page fault occurs, but it cannot be handled in the normal
way and the system hangs.

In principle we can call create_resume_mapping() from
swsusp_arch_resume() (ie.  from suspend_asm.S), but then the memory
allocations in create_resume_mapping(), resume_pud_mapping(), and
resume_pmd_mapping() must be made carefully so that we use _only_
NosaveFree pages in them (the other pages are overwritten by the loop in
swsusp_arch_resume()).  Additionally, we are in atomic context at that
time, so we cannot use GFP_KERNEL.  Moreover, if one of the allocations
fails, we should free all of the allocated pages, so we need to trace
them somehow.

All of this is done in the appended patch, except that the functions
populating the page tables are located in arch/x86_64/kernel/suspend.c
rather than in init.c.  It may be done in a more elegan way in the
future, with the help of some swsusp patches that are in the works now.

[AK: move some externs into headers, renamed a function]

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-10 08:36:46 -07:00
Andi Kleen 944d2647dd [PATCH] x86_64: Drop global bit from early low mappings
Drop global bit from early low mappings

Suggested by Linus, originally also proposed by Suresh.

This fixes a race condition with early start of udev, originally
tracked down by Suresh B. Siddha. The problem was that switching
to the user space VM would not clear the global low mappings
for the beginning of memory, which lead to memory corruption.

Drop the global bits.

The kernel mapping stays global because it should stay constant.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-04 15:56:52 -07:00
Ravikiran G Thirumalai ddea7be0ec [PATCH] x86_64: Fix numa node topology detection for srat based x86_64 boxes
2.6.14-rc2 does not assign cpus to proper nodeids on our em64t numa boxen.
Our boxes use acpi srat for parsing the numa information.

srat_detect_node() used phys_proc_id[] to get to the cpu's local apic id,
but phys_proc_id[] represents the cpu<->initial_apic_id mapping.  The
following patch fixes this problem.  Now apicid_to_node[] is properly
indexed with the local apic id.

Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-03 10:54:22 -07:00
Ravikiran G Thirumalai 85cc5135ac [PATCH] x86_64 early numa init fix
The tests Alok carried out on Petr's box confirmed that cpu_to_node[BP] is
not setup early enough by numa_init_array due to the x86_64 changes in
2.6.14-rc*, and unfortunately set wrongly by the work around code in
numa_init_array().  cpu_to_node[0] gets set with 1 early and later gets set
properly to 0 during identify_cpu() when all cpus are brought up, but
confusing the numa slab in the process.

Here is a quick fix for this.  The right fix obviously is to have
cpu_to_node[bsp] setup early for numa_init_array().  The following patch
will fix the problem now, and the code can stay on even when
cpu_to_node{BP] gets fixed early correctly.

Thanks to Petr for access to his box.

Signed off by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Alok N Kataria <alokk@calsoftinc.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-30 12:41:20 -07:00
Ravikiran G Thirumalai e6a045a5b8 [PATCH] x86_64: fix the BP node_to_cpumask
Fix the BP node_to_cpumask.  2.6.14-rc* broke the boot cpu bit as the
cpu_to_node(0) is now not setup early enough for numa_init_array.
cpu_to_node[] is setup much later at srat_detect_node on acpi srat based
em64t machines.  This seems like a problem on amd machines too, Tested on
em64t though.  /sys/devices/system/node/node0/cpumap shows up sanely after
this patch.

Signed off by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Shai Fultheim <shai@scalex86.org>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-30 12:41:20 -07:00
Zhang, Yanmin 2dd960d66b [PATCH] utilization of kprobe_mutex is incorrect on x86_64
The up()/down() orders are incorrect in arch/x86_64/kprobes.c file.
kprobe_mutext is used to protect the free kprobe instruction slot list.
arch_prepare_kprobe applies for a slot from the free list, and
arch_remove_kprobe returns a slot to the free list.  The incorrect up()/down()
orders to operate on kprobe_mutex fail to protect the free list.  If 2 threads
try to get/return kprobe instruction slot at the same time, the free slot list
might be broken, or a free slot might be applied by 2 threads.

Signed-off-by: Zhang Yanmin <Yanmin.zhang@intel.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-30 12:41:20 -07:00
Mike Waychison 7644143cd6 [PATCH] x86_64: Fix mce_log
The attempt to fixup the lockless mce log buffer introduced an infinite loop
when trying to find a free entry.

And:

Using rcu_dereference() to load mcelog.next doesn't seem to be sufficient
enough to ensure that mcelog.next is loaded each time around the loop in
mce_log().  Instead, use an explicit rmb() to ensure that the compiler gets it
right.

AK: turned the smp_wmbs into true wmbs to make sure they are not
reordered by the compiler on UP.

Signed-off-by: Mike Waychison <mikew@google.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-29 15:41:42 -07:00
Andi Kleen 7d318d7747 [PATCH] Fix up TLB flush filter disabling
I checked with AMD and they requested to only disable it for family 15.
Also disable it for i386 too. And some style fixes.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-29 15:41:42 -07:00
John W. Linville 6c654b5fdf [PATCH] swiotlb: move from arch/ia64/lib/ to lib/
The swiotlb implementation is shared by both IA-64 and EM64T. However,
the source itself lives under arch/ia64. This patch moves swiotlb.c
from arch/ia64/lib to lib/ and fixes-up the appropriate Makefile and
Kconfig files. No actual changes are made to swiotlb.c.

Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-09-29 14:42:42 -07:00
john stultz 6c132b5fe6 [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs
This should resolve the issue seen in bugme bug #5105, where it is assumed
that dualcore x86_64 systems have synced TSCs.  This is not the case, and
alternate timesources should be used instead.

For more details, see:
http://bugzilla.kernel.org/show_bug.cgi?id=5105

Andi's earlier concerns that the TSCs should be synced on dualcore systems
have been resolved by confirmation from AMD folks that they can be
unsynced.

Acked-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-28 07:46:41 -07:00
Randy Dunlap 89d7cbf73e [PATCH] update URL for HPET spec.
Correct URL for HPET spec.

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-21 10:12:18 -07:00
Linus Torvalds bc5e8fdfc6 x86-64/smp: fix random SIGSEGV issues
They seem to have been due to AMD errata 63/122; the fix is to disable
TLB flush filtering in SMP configurations.

Confirmed to fix the problem by Andrew Walrond <andrew@walrond.org>

[ Let's see if we'll have a better fix eventually, this is the Q&D
  "let's get this fixed and out there" version ]

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-17 15:41:04 -07:00
Andrew Morton b9491ac835 [PATCH] x86_64: e820.c needs module.h
For EXPORT_SYMBOL.

Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-17 11:50:00 -07:00
David S. Miller 4db2ce0199 [LIB]: Consolidate _atomic_dec_and_lock()
Several implementations were essentialy a common piece of C code using
the cmpxchg() macro.  Put the implementation in one spot that everyone
can share, and convert sparc64 over to using this.

Alpha is the lone arch-specific implementation, which codes up a
special fast path for the common case in order to avoid GP reloading
which a pure C version would require.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-14 21:47:01 -07:00
Linus Torvalds 1619cca292 Partially revert "Fix time going twice as fast problem on ATI Xpress chipsets"
Commit 66759a01ad introduced the fix for
time ticking too fast on some boards by disabling one of the doubly
connected timer pins on ATI boards.

However, it ends up being _much_ too broad a brush, and that just makes
some other ATI boards not work at all since they now have no timer
source.

So disable the automatic ATI southbridge detection, and just rely on
people who see this problem disabling it by hand with the option
"disable_timer_pin_1" on the kernel command line.

Maybe somebody can figure out the proper tests at a later date.

Acked-by: Peter Osterlund <petero2@telia.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-14 15:56:27 -07:00
Hugh Dickins 2fd4ef85e0 [PATCH] error path in setup_arg_pages() misses vm_unacct_memory()
Pavel Emelianov and Kirill Korotaev observe that fs and arch users of
security_vm_enough_memory tend to forget to vm_unacct_memory when a
failure occurs further down (typically in setup_arg_pages variants).

These are all users of insert_vm_struct, and that reservation will only
be unaccounted on exit if the vma is marked VM_ACCOUNT: which in some
cases it is (hidden inside VM_STACK_FLAGS) and in some cases it isn't.

So x86_64 32-bit and ppc64 vDSO ELFs have been leaking memory into
Committed_AS each time they're run.  But don't add VM_ACCOUNT to them,
it's inappropriate to reserve against the very unlikely case that gdb
be used to COW a vDSO page - we ought to do something about that in
do_wp_page, but there are yet other inconsistencies to be resolved.

The safe and economical way to fix this is to let insert_vm_struct do
the security_vm_enough_memory check when it finds VM_ACCOUNT is set.

And the MIPS irix_brk has been calling security_vm_enough_memory before
calling do_brk which repeats it, doubly accounting and so also leaking.
Remove that, and all the fs and arch calls to security_vm_enough_memory:
give it a less misleading name later on.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-Off-By: Kirill Korotaev <dev@sw.ru>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-14 11:18:13 -07:00
Andi Kleen f3591fff04 [PATCH] x86_64: Export end_pfn
Fixes

> if [ -r System.map -a -x /sbin/depmod ]; then /sbin/depmod -ae -F
> System.map  2. 6.14-rc1; fi
> WARNING: /lib/modules/2.6.14-rc1/kernel/drivers/char/agp/amd64-agp.ko
> needs unknown symbol end_pfn

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-13 09:59:04 -07:00
Jan Beulich 42ac8ff2ce [PATCH] x86_64: NMI watchdog frequency calculation adjustments
Like previously done for i386, get the x86_64 watchdog tick calculation
into a state where it can also be used on CPUs with frequencies beyond
4GHz.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Acked-by: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-13 08:22:33 -07:00
Randy Dunlap 9f1583339a [PATCH] use add_taint() for setting tainted bit flags
Use the add_taint() interface for setting tainted bit flags instead of
doing it manually.

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-13 08:22:29 -07:00
Andi Kleen 4b6a455c74 [PATCH] x86-64: Use correct mask to compute conflicting nodes in SRAT
The nodes are not set online yet at this point.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 10:50:58 -07:00
Andi Kleen 2bce2b54ae [PATCH] x86-64: reset apicid<->node tables when SRAT cannot be parsed
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 10:50:58 -07:00
Andi Kleen e58e0d0312 [PATCH] x86-64: Clean up the SRAT node list before computing the hash function
Also use for_each_node_mask instead of hand crafted loops.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 10:50:58 -07:00
Chuck Ebbert 66759a01ad [PATCH] x86-64: i386/x86-64: Fix time going twice as fast problem on ATI Xpress chipsets
Original patch from Bertro Simul

This is probably still not quite correct, but seems to be
the best solution so far.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 10:50:58 -07:00
Jan Beulich 049cdefe19 [PATCH] x86-64: reduce x86-64 bug frame by 4 bytes
As mentioned before, the size of the bug frame can be further reduced while
continuing to use instructions to encode the information.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 10:50:58 -07:00
Al Viro 9cdd304b20 [PATCH] x86-64: more gratitious linux/irq.h includes
... and with that all instances in arch/x86_64 are gone.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 10:50:58 -07:00
Chuck Ebbert ff347b2215 [PATCH] x86-64: Fix incorrect FP signals
This is the same patch that went into i386 just before 2.6.13
came out.  I still can't build 64-bit user apps, so I tested
with program (see below) in 32-bit mode on 64-bit kernel:

Before:

	$ fpsig
	handler: nr = 8, si = 0x0804bc90, vuc = 0x0804bd10
	handler: altstack is at 0x0804b000, ebp = 0x0804bc7c
	handler: si_signo = 8, si_errno = 0, si_code = 0 [unknown]
	handler: fpu cwd = 0xb40, fpu swd = 0xbaa0
	handler: i387 unmasked precision exception, rounded up

After:

	$ fpsig
	handler: nr = 8, si = 0x0804bc90, vuc = 0x0804bd10
	handler: altstack is at 0x0804b000, ebp = 0x0804bc7c
	handler: si_signo = 8, si_errno = 0, si_code = 6 [inexact result]
	handler: fpu cwd = 0xb40, fpu swd = 0xbaa0
	handler: i387 unmasked precision exception, rounded up

Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 10:50:58 -07:00
Chuck Ebbert 847815760c [PATCH] x86-64: Clean up nmi error message
The x86_64 nmi code is missing a newline in one of its messages.

I added a space before the CPU id for readability and killed the trailing
space on the previous line as well.

Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 10:50:58 -07:00
Andi Kleen a2a0c992e9 [PATCH] x86-64: Remove unused vxtime.hz field
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 10:50:57 -07:00
Andi Kleen a0d58c9741 [PATCH] x86-64: Set the stack pointer correctly in init_thread and init_tss
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 10:50:57 -07:00
Jan Beulich 1209140c3c [PATCH] x86-64: Safe interrupts in oops_begin/end
Rather than blindly re-enabling interrupts in oops_end(), save their state
in oope_begin() and then restore that state.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 10:50:57 -07:00
Andi Kleen 059bf0f6c3 [PATCH] x86-64: Merge msr.c with i386 version
The only difference was the inline assembly, so move that into
asm/msr.h and merge with the i386 version.

This adds some missing sysfs support code to x86-64.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 10:50:57 -07:00
Al Viro 55679edb19 [PATCH] x86-64: Clean up includes in arch/x86_64/kernel/suspend.c
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 10:50:57 -07:00
Jan Beulich 7effaa882a [PATCH] x86-64: Fix CFI information
Being the foundation for reliable stack unwinding, this fixes CFI unwind
annotations in many low-level x86_64 routines, plus a config option
(available to all architectures, and also present in the previously sent
patch adding such annotations to i386 code) to enable them separatly
rather than only along with adding full debug information.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 10:50:56 -07:00
Andi Kleen b3ab838224 [PATCH] x86-64: Fix gcc 4 warnings about pointer signedness
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 10:50:56 -07:00
Andi Kleen 5bf97e0119 [PATCH] x86-64: Use physflat on Intel for < 8 CPUs with CPU hotplug
This avoids races with the APIC broadcast/mask modes.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 10:50:56 -07:00
Andi Kleen 05d1fa4bf6 [PATCH] x86-64: Improve error handling for overlapping PXMs in SRAT.
- Report PXMs instead of nodes
- Report the correct PXM, not always the one of node 1.
- Only warn for the case of a PXM overlapping by itself

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 10:50:56 -07:00
Andi Kleen 2e8ad43ec0 [PATCH] x86-64: Prevent gcc 4 from optimizing away vsyscalls
They were previously static.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 10:50:56 -07:00
Ashok Raj c1a71a1ede [PATCH] x86-64: Delivery mode should be APIC_DM_FIXED when using physical mode.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 10:50:56 -07:00
Andi Kleen 47e5701e37 [PATCH] x86-64: Remove freeing of SMP trampoline pages
Nick points out it never worked because PageReserved was
set and it might cause problems later on. Also HOTPLUG_CPU
is much more common now so let's care not too much
about the !hotplug case.

Cc: nickpiggin@yahoo.com.au

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 10:50:56 -07:00
Andi Kleen 016102dea8 [PATCH] x86-64: Fix typo CONFIG_CPU_HOTPLUG -> CONFIG_HOTPLUG_CPU in genapic.c
Noted by Ashok Raj

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 10:50:56 -07:00
Alexander Nyberg 24dead8ac9 [PATCH] Remove unnecessary BUG_ON in irq.c
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 10:50:55 -07:00
Andi Kleen e92343cc8e [PATCH] x86-64: Fix show_mem a little bit
- Add KERN_INFO to printks (from i386)
- Use longs instead of ints to accumulate pages.
- Fix broken indenting.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 10:50:55 -07:00
Andi Kleen 083044e63b [PATCH] x86-64: Remove disable_tsc code in context switch
It only offers extremly dubious security advantages and
is not worth the overhead in this critical path.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 10:50:55 -07:00
Andi Kleen fe5d5f073e [PATCH] x86-64: Print version at end of kernel build
(from i386)

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 10:50:55 -07:00
Andi Kleen 48496e3495 [PATCH] x86-64: Fix (harmless) typo in head.S early level2 page table
The global bit  was not set in the first 2MB page, instead
it had a bit in the free AVL section which is useless.
Fixed thus.

Noticed by Eric Biederman

Cc:  Eric W. Biederman <ebiederm@xmission.com>

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 10:50:55 -07:00
Hugh Dickins b8f68e9ffa [PATCH] x86-64: Fix idle=poll
x86_64 idle=poll might be a little less responsive than it should: unlike
mwait_idle, and unlike i386, its poll_idle left TIF_POLLING_NRFLAG set.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 10:50:55 -07:00
Andi Kleen e99b861a3e [PATCH] x86-64: Only allocate per cpu data for possible CPUs, not compiled in CPUs.
Saves some memory except for hotplug situations.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 10:49:58 -07:00
Andi Kleen 3f74478b5f [PATCH] x86-64: Some cleanup and optimization to the processor data area.
- Remove unused irqrsp field
- Remove pda->me
- Optimize set_softirq_pending slightly

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 10:49:58 -07:00
Andi Kleen 459192c92c [PATCH] x86-64: Add simnow console
This adds console and earlyprintk support for a host file
on AMD's SimNow simulator.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 10:49:58 -07:00
Andi Kleen e5bc8b6baf [PATCH] x86-64: Make remote TLB flush more scalable
Instead of using a global spinlock to protect the state
of the remote TLB flush use a lock and state for each sending CPU.

To tell the receiver where to look for the state use 8 different
call vectors.  Each CPU uses a specific vector to trigger flushes on other
CPUs. Depending on the received vector the target CPUs look into
the right per cpu variable for the flush data.

When the system has more than 8 CPUs they are hashed to the 8 available
vectors. The limited global vector space forces us to this right now.
In future when interrupts are split into per CPU domains this could be
fixed, at the cost of needing more IPIs in flat mode.

Also some minor cleanup in the smp flush code and remove some outdated
debug code.

Requires patch to move cpu_possible_map setup earlier.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 10:49:58 -07:00
Tsuneo.Yoshioka@f-secure.com 83b942bd34 [PATCH] x86-64: Fix 32bit sendfile
If we use 64bit kernel on ia64/x86_64/s390 architecture, and we run
32bit binary on 32bit compatibility mode, sendfile system call seems be
not set offset argument.

This is because sendfile's return value is not zero but the code regards
the result by return value is zero or not.

This problem will be affect to ia64/x86_64/s390 and not affect to other
architecture does not affect other architecture (mips/parisc/ppc64/sparc64).

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 10:49:57 -07:00
Andi Kleen 9acf23c42b [PATCH] x86-64: Include build number in oops output
Include build number in oops output

Helps me to match oopses to correct kernel.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 10:49:57 -07:00
Andi Kleen 69e1a33f62 [PATCH] x86-64: Use ACPI PXM to parse PCI<->node assignments
Since this is shared code I had to implement it for i386 too

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 10:49:57 -07:00
Andi Kleen 413588c7cb [PATCH] x86-64: Remove code to resume machine check state of other CPUs.
The resume code uses CPU hotplug now so at resume time
we only ever see one CPU.

Pointed out by Yu Luming.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 10:49:57 -07:00
Andi Kleen b9aac10ddd [PATCH] x86-64: Remove redundant max_mapnr and replace with end_pfn
The FLATMEM people added it, but there doesn't seem a good reason
because end_pfn is identical.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 10:49:57 -07:00
Andi Kleen 0a43e4bf74 [PATCH] x86-64: Use e820_find_hole to compute reserved pages
Avoids a very dumb loop

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 10:49:57 -07:00
Andi Kleen 7c7a3897f6 [PATCH] x86-64: Fix harmless off by one in e820 code
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 10:49:57 -07:00
Andi Kleen 117090b5e8 [PATCH] x86-64: Micro optimization to dma_alloc_coherent node lookup
Use pcibus_to_node directly

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 10:49:57 -07:00
Andi Kleen 1d3fbbf9fe [PATCH] x86-64: Don't trust boot_cpu_id in the mptable.
It could be wrong for kexec or other cases. Read it from
the CPU instead.

Signed-off-by: Murali <muralim@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 10:49:57 -07:00
Andi Kleen 8c566ef5f3 [PATCH] x86-64: Add command line option to set machine check tolerance level
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 10:49:57 -07:00