Commit graph

590 commits

Author SHA1 Message Date
Linus Torvalds
cbf0ec6ee0 Revert "[PATCH] x86-64: Fix up handling of non canonical user RIPs"
This reverts commit c33d4568ac.

Andrew Clayton and Hugh Dickins report that it's broken for them and
causes strange page table and slab corruption, and spontaneous reboots.

Let's get it right next time.

Cc: Andrew Clayton <andrew@rootshell.co.uk>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-14 08:01:47 -08:00
Andi Kleen
c33d4568ac [PATCH] x86-64: Fix up handling of non canonical user RIPs
EM64T CPUs have somewhat weird error reporting for non canonical RIPs in
SYSRET.

We can't handle any exceptions there because the exception handler would
end up running on the user stack which is unsafe.

To avoid problems any code that might end up with a user touched pt_regs
should return using int_ret_from_syscall.  int_ret_from_syscall ends up
using IRET, which allows safe exceptions.

Cc: Ernie Petrides <petrides@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-12 22:56:29 -08:00
Michael Matz
2ec5e3a867 [PATCH] fix kexec asm
While testing kexec and kdump we hit problems where the new kernel would
freeze or instantly reboot.  The easiest way to trigger it was to kexec a
kernel compiled for CONFIG_M586 on an athlon cpu.  Compiling for CONFIG_MK7
instead would work fine.

The patch fixes a few problems with the kexec inline asm.

Signed-off-by: Chris Mason <mason@suse.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-08 14:15:04 -08:00
Linus Torvalds
637029c6cb Revert "[PATCH] x86_64: Only do the clustered systems have unsynchronized TSC assumption on IBM systems"
This reverts commit 13a229abc2.

Quoth Andi:
  "After some consideration and feedback from various people it turns
   out this wasn't that good an idea.  It has some problems and needs
   more work.  Since it was only an optimization anyways it's best to
   just back it out again for now."

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-27 20:41:56 -08:00
Linus Torvalds
add2b6fdae Make Kprobes depend on modules
Commit 9ec4b1f356 made kprobes not compile
without module support, so just make that clear in the Kconfig file.

Also, since it's marked EXPERIMENTAL, make that dependency explicit too.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-26 20:24:40 -08:00
Brian Magnuson
d51761233d [PATCH] fix build on x86_64 with !CONFIG_HOTPLUG_CPU
The commit e2c0388866 added
setup_additional_cpus to setup.c but this is only defined if
CONFIG_HOTPLUG_CPU is set.  This patch changes the #ifdef to reflect that.

Signed-off-by: Brian Magnuson <magnuson@rcn.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-26 19:07:42 -08:00
Andi Kleen
ab9b32ee62 [PATCH] x86_64: Better ATI timer fix
The previous experiment for using apicmaintimer on ATI systems didn't
work out very well.  In particular laptops with C2/C3 support often
don't let it tick during idle, which makes it useless.  There were also
some other bugs that made the apicmaintimer often not used at all.

I tried some other experiments - running timer over RTC and some other
things but they didn't really work well neither.

I rechecked the specs now and it turns out this simple change is
actually enough to avoid the double ticks on the ATI systems.  We just
turn off IRQ 0 in the 8254 and only route it directly using the IO-APIC.

I tested it on a few ATI systems and it worked there.  In fact it worked
on all chipsets (NVidia, Intel, AMD, ATI) I tried it on.

According to the ACPI spec routing should always work through the
IO-APIC so I think it's the correct thing to do anyways (and most of the
old gunk in check_timer should be thrown away for x86-64).

But for 2.6.16 it's best to do a fairly minimal change:
 - Use the known to be working everywhere-but-ATI IRQ0 both over 8254
   and IO-APIC setup everywhere
 - Except on ATI disable IRQ0 in the 8254
 - Remove the code to select apicmaintimer on ATI chipsets
 - Add some boot options to allow to override this (just paranoia)

In 2.6.17 I hope to switch the default over to this for everybody.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-26 09:53:31 -08:00
Andi Kleen
e8b917775b [PATCH] x86_64: Move the SMP time selection earlier
SMP time selection originally ran after all CPUs were brought up because
it needed to know the number of CPUs to decide if it needs an MP safe
timer or not.

This is not needed anymore because we know present CPUs early.

This fixes a couple of problems:
 - apicmaintimer didn't always work because it relied on state that was
   set up time_init_gtod too late.
 - The output for the used timer in early kernel log was misleading
   because time_init_gtod could actually change it later.  Now always
   print the final timer choice

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-26 09:53:31 -08:00
Andi Kleen
e2c0388866 [PATCH] x86_64: Fix the additional_cpus=.. option
It didn't set up the CPU possible map early enough, so the
option didn't actually work.

Noticed by Heiko Carstens

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-26 09:53:30 -08:00
Chris McDermott
1f99215392 [PATCH] x86_64: Fix NMI watchdog on x460
[description from AK]

Old check for the IO-APIC watchdog during the timer check was wrong -
it obviously should only drop into this if the IO-APIC watchdog is used.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-26 09:53:30 -08:00
Andi Kleen
e78256b8f3 [PATCH] x86-64/i386: Use common X86_PM_TIMER option and make it EMBEDDED
This makes x86-64 use the common X86_PM_TIMER Kconfig entry in drivers/acpi

And since PM timer is needed for correct timing on a lot of systems
now (e.g. AMD dual cores) and we often get bug reports from people
who forgot to set it make it depend on CONFIG_EMBEDDED. x86-64 had
this change before and it's a good thing.

I also fixed the description slightly to make this more clear.

Cc: len.brown@intel.com

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-26 09:53:30 -08:00
Andi Kleen
13a229abc2 [PATCH] x86_64: Only do the clustered systems have unsynchronized TSC assumption on IBM systems
Big Unisys systems have multiple clusters too, but they have an
synchronized TSC.

I'm using the SMBIOS to check for vendor == IBM.

Cc: Chris McDermott <lcm@us.ibm.com>
Cc: "Protasevich, Natalie" <Natalie.Protasevich@unisys.com>

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-26 09:53:30 -08:00
Jon Mason
60b08c6722 [PATCH] x86_64: no_iommu removal in pci-gart.c
In previous versions of pci-gart.c, no_iommu was used to determine if IOMMU was
disabled in the GART DMA mapping functions.  This changed in 2.6.16 and now
gart_xxx() functions are only called if gart is enabled.  Therefore, uses of
no_iommu in the GART code are no longer necessary and can be removed.

Also, it removes double deceleration of no_iommu and force_iommu in pci.h and
proto.h, by removing the deceleration in pci.h.

Lastly, end_pfn off by one error.

Tested (along with patch 1/2) on dual opteron with gart enabled, iommu=soft,
and iommu=off.

Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-26 09:53:29 -08:00
Dave Jones
a0124d780d [PATCH] x86-64: react to new topology.c location
Commit 9c869edac5 moved the i386 topology.c
file. That change broke x86-64 compiles, as it uses the same file.

Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-25 11:12:15 -08:00
Andi Kleen
2e2b426366 [PATCH] x86_64: Don't set CONFIG_DEBUG_INFO in defconfig
Undo setting of CONFIG_DEBUG_INFO in the previous defconfig update.  It
will make every build much slower and need more disk space and isn't a good
default.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-20 20:00:10 -08:00
Tim Hockin
9ff4ced467 [PATCH] Remove KERN_INFO from middle of printk line
Don't print KERN_INFO in the middle of a printk line.
	printk(KERN_INFO "OEM ID: %s ",str);
is just above this. This is already fixed up in i386 copy.

Signed-off-by: Martin J. Bligh <mbligh@google.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-17 13:59:27 -08:00
Andi Kleen
2aed711a39 [PATCH] x86_64: Always pass full number of nodes to NUMA hash computation
Previously the numa hash code would be confused by holes in the node space
and stop early. This is the first part of the fix for the non boot issue
with empty nodes on Opterons.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-17 08:00:41 -08:00
Andi Kleen
fdb9df9424 [PATCH] x86_64: Relax SRAT covers all memory check a bit
Code was refusing good SRATs because about 12K got lost somewhere.
Allow less than 1MB of difference before rejecting it.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-17 08:00:41 -08:00
Andi Kleen
6574ffd74b [PATCH] x86_64: Resolve the RIP of an early exception using kallsyms
But do it after everything else to risk less from recursive
crashes.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-17 08:00:40 -08:00
Andi Kleen
7fd67843b9 [PATCH] x86_64: Disable tsc when apicpmtimer is active
Otherwise it has no effect anyways.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-17 08:00:40 -08:00
Andi Kleen
ab68805955 [PATCH] x86_64: Don't enable ATI apicmaintimer workaround when the machine has C2 or C3
Many laptops have problems with ticking the local APIC timer in C2/C3.
The code added earlier to use it by default on ATI didn't really work
for them. Don't enable it when the system supports C2/C3.

This doesn't fix the problem fully, but at least it's not worse than before.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-17 08:00:40 -08:00
Andi Kleen
2391c4b594 [PATCH] x86_64: Don't call do_exit with interrupts disabled after IRET exception
This caused a sigreturn with bad argument on a preemptible kernel
to complain with

Debug: sleeping function called from invalid context at /home/lsrc/quilt/linux/include/linux/rwsem.h:43
in_atomic():0, irqs_disabled():1

Call Trace: {__might_sleep+190} {profile_task_exit+21}
       {__do_exit+34} {do_wait+0}

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-17 08:00:40 -08:00
Jan Beulich
99019e9199 [PATCH] x86_64: make touch_nmi_watchdog() not touch impossible cpus' private data
Along with that, also suppress the memory touching altogether when the
watchdog is not running, to eliminate needless crosstalk. Plus ad a call
to it to make things consistent (one could also consider removing the call
in enable_timer_nmi_watchdog()).

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-17 08:00:40 -08:00
Andi Kleen
e4444d1a30 [PATCH] x86_64: Update defconfig
... and enable 1394 by default.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-17 08:00:40 -08:00
Daniel Yeisley
d1db4ec86c [PATCH] x86_64: early initialization of cpu_to_node
The early initialization of cpu_to_node code as it is now only updates the
cpu_to_node array, and does not update cpu_pda()->nodemember.  This will
cause numa_node_id() to return 0 on systems where CPU 0 is not on Node 0.
This leads to a kernel panic in slab.c.

I've tested the patch below on a 16 processor x86_64 ES7000-600 server, and
no longer see the panic I saw with the original 2.6.16-rc3.

Signed-off-by: Dan Yeisley <dan.yeisley@unisys.com>
Acked-by: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-15 15:32:22 -08:00
Andi Kleen
0d541064e8 [PATCH] x86_64: GART DMA merging fix
Don't touch the non DMA members in the sg list in dma_map_sg in the IOMMU

Some drivers (in particular ST) ran into problems because they reused the sg
lists after passing them to pci_map_sg().  The merging procedure in the K8
GART IOMMU corrupted the state.  This patch changes it to only touch the dma*
entries during merging, but not the other fields.  Approach suggested by Dave
Miller.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-12 16:10:47 -08:00
John Blackwood
a65d17c9d2 [PATCH] arch/x86_64/kernel/traps.c PTRACE_SINGLESTEP oops
We found a problem with x86_64 kernels with preemption enabled, where
having multiple tasks doing ptrace singlesteps around the same time will
cause the system to 'oops'.  The problem seems that a task can get
preempted out of the do_debug() processing while it is running on the
DEBUG_STACK stack.  If another task on that same cpu then enters do_debug()
and uses the same per-cpu DEBUG_STACK stack, the previous preempted tasks's
stack contents can be corrupted, and the system will oops when the
preempted task is context switched back in again.

The typical oops looks like the following:

  Unable to handle kernel paging request at ffffffffffffffae RIP: <ffffffff805452a1>{thread_return+34}
  PGD 103027 PUD 102429067 PMD 0
  Oops: 0002 [1] PREEMPT SMP
  CPU 0
  Modules linked in:
  Pid: 3786, comm: ssdd Not tainted 2.6.15.2 #1
  RIP: 0010:[<ffffffff805452a1>] <ffffffff805452a1>{thread_return+34}
  RSP: 0018:ffffffff80824058  EFLAGS: 000136c2
  RAX: ffff81017e12cea0 RBX: 0000000000000000 RCX: 00000000c0000100
  RDX: 0000000000000000 RSI: ffff8100f7856e20 RDI: ffff81017e12cea0
  RBP: 0000000000000046 R08: ffff8100f68a6000 R09: 0000000000000000
  R10: 0000000000000000 R11: ffff81017e12cea0 R12: ffff81000c2d53e8
  R13: ffff81017f5b3be8 R14: ffff81000c0036e0 R15: 000001056cbfc899
  FS:  00002aaaaaad9b00(0000) GS:ffffffff80883800(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
  CR2: ffffffffffffffae CR3: 00000000f6fcf000 CR4: 00000000000006e0
  Process ssdd (pid: 3786, threadinfo ffff8100f68a6000, task ffff8100f7856e20)
  Stack: ffffffff808240d8 ffffffff8012a84a ffff8100055f6c00 0000000000000020
         0000000000000001 ffff81000c0036e0 ffffffff808240b8 0000000000000000
         0000000000000000 0000000000000000
  Call Trace: <#DB>
	<ffffffff8012a84a>{try_to_wake_up+985}
	<ffffffff8012c0d3>{kick_process+87}
        <ffffffff8013b262>{signal_wake_up+48}
	<ffffffff8013b5ce>{specific_send_sig_info+179}
        <ffffffff80546abc>{_spin_unlock_irqrestore+27}
	<ffffffff8013b67c>{force_sig_info+159}
        <ffffffff801103a0>{do_debug+289} <ffffffff80110278>{sync_regs+103}
        <ffffffff8010ed9a>{paranoid_userspace+35}
  Unable to handle kernel paging request at 00007fffffb7d000 RIP: <ffffffff8010f2e4>{show_trace+465}
  PGD f6f25067 PUD f6fcc067 PMD f6957067 PTE 0
  Oops: 0000 [2] PREEMPT SMP

This patch disables preemptions for the task upon entry to do_debug(), before
interrupts are reenabled, and then disables preemption before exiting
do_debug(), after disabling interrupts.  I've noticed that the task can be
preempted either at the end of an interrupt, or on the call to
force_sig_info() on the spin_unlock_irqrestore() processing.  It might be
better to attempt to code a fix in entry.S around the code that calls
do_debug().

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-12 16:10:47 -08:00
Chris McDermott
33042a9ff4 [PATCH] x86-64: Fix HPET timer on x460
[description from AK]

The IBM Summit 3 chipset doesn't implement the HPET timer replacement
option.  Since the current Linux code relies on it use a mixed mode with
both PIT for the interrupt and HPET counters for the time keeping.  That
was already implemented, but didn't work properly because it was still
using the last interrupt offset in HPET.  This resulted in x460 not
booting.  Fix this up by using the free running HPET counter.

Shouldn't affect any other machine because they either use full HPET mode
or no HPET at all.

TBD needs a similar 32bit fix.

Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Pallipadi, Venkatesh" <venkatesh.pallipadi@intel.com>
Cc: Bob Picco <bob.picco@hp.com>
Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: john stultz <johnstul@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-11 21:41:11 -08:00
Ulrich Drepper
cff2b76009 [PATCH] fstatat64 support
The *at patches introduced fstatat and, due to inusfficient research, I
used the newfstat functions generally as the guideline.  The result is that
on 32-bit platforms we don't have all the information needed to implement
fstatat64.

This patch modifies the code to pass up 64-bit information if
__ARCH_WANT_STAT64 is defined.  I renamed the syscall entry point to make
this clear.  Other archs will continue to use the existing code.  On x86-64
the compat code is implemented using a new sys32_ function.  this is what
is done for the other stat syscalls as well.

This patch might break some other archs (those which define
__ARCH_WANT_STAT64 and which already wired up the syscall).  Yet others
might need changes to accomodate the compatibility mode.  I really don't
want to do that work because all this stat handling is a mess (more so in
glibc, but the kernel is also affected).  It should be done by the arch
maintainers.  I'll provide some stand-alone test shortly.  Those who are
eager could compile glibc and run 'make check' (no installation needed).

The patch below has been tested on x86 and x86-64.

Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-11 21:41:10 -08:00
Andi Kleen
4b88f09364 [PATCH] x86-64: Add sys_unshare
Add unshare syscall for x86-64

ppoll/pselect are not ready yet, but add reservations.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-08 15:52:15 -08:00
Al Viro
cc59853b4a [PATCH] arch/x86_64/pci/mmconfig.c NULL noise removal
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-02-07 20:59:01 -05:00
Al Viro
dd42b15186 [PATCH] amd64 time.c __iomem annotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-02-07 20:58:45 -05:00
Al Viro
4fb7d9827e [PATCH] drive_info removal outside of arch/i386
drive_info is used only by hd.c and that happens under #ifdef __i386__.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-02-07 20:56:47 -05:00
Ravikiran G Thirumalai
488fc08d91 [PATCH] x86_64: Fix the node cpumask of a cpu going down
Currently, x86_64 and ia64 arches do not clear the corresponding bits in
the node's cpumask when a cpu goes down or cpu bring up is cancelled.  This
is buggy since there are pieces of common code where the cpumask is checked
in the cpu down code path to decide on things (like in the slab down path).
 PPC does the right thing, but x86_64 and ia64 don't (This was the reason
Sonny hit upon a slab bug during cpu offline on ppc and could not reproduce
on other arches).  This patch fixes it for x86_64.  I won't attempt ia64 as
I cannot test it.

Credit for spotting this should go to Alok.

(akpm: this was applied, then reverted.  But it's OK now because we now use
for_each_cpu() in the right places).

Signed-off-by: Alok N Kataria <alokk@calsoftinc.com>
Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Shai Fultheim <shai@scalex86.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-07 16:12:31 -08:00
Linus Torvalds
cef5076987 Revert "[PATCH] x86_64: Fix the node cpumask of a cpu going down"
This reverts commit 10f4dc8b27.

Quoth Andi Kleen:
  "Kiran decided that it makes the problem worse than it was before.
   Fixing it fully requires more work which is too much for 2.6.16.  So
   please revert that commit for now."

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-05 10:51:57 -08:00
Jon Mason
5b7b644ca9 [PATCH] x86_64: IOMMU printk cleanup
This patch contains a printk reorder to remove the current problem of
displaying "PCI-DMA: Disabling IOMMU." and then "PCI-DMA: using GART
IOMMU" 20 lines later in dmesg.

It also constains a printk reorder in swiotlb to state swiotlb
enablement prior to describing the location of the bounce buffers, and a
printk reorder to state gart enablement prior to describing the
aperature.

Also constains a whitespace cleanup in arch/x86_64/kernel/setup.c

Tested (along with patch 2/2) on dual opteron with gart enabled,
iommu=soft, and iommu=off.

Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-04 16:43:15 -08:00
Andi Kleen
14c3f85587 [PATCH] x86_64: Let impossible CPUs point to reference per cpu data
Hack for 2.6.16. In 2.6.17 all code that uses NR_CPUs should
be audited and changed to only touch possible CPUs.

Don't mark the reference per cpu data init data (so it stays
around after boot) and point all impossible CPUs to it. This way
they reference some valid - although shared memory. Usually
this is only initialization like INIT_LIST_HEADs and there
won't be races because these CPUs never run. Still somewhat hackish.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-04 16:43:15 -08:00
Andi Kleen
3777a95903 [PATCH] i386/x86-64: Don't ack the APIC for bad interrupts when the APIC is not enabled
It's bad juju to touch the APIC when it hasn't been enabled.
I also moved ack_bad_irq for x86-64 out of line following i386.

Signed-off-by: Andi Kleen <ak@suse.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-04 16:43:15 -08:00
Jan Beulich
d646bce4c7 [PATCH] x86_64: minor odering correction to dump_pagetable()
Checking of the validity of pointers should be consistently done before
dereferencing the pointer.

Signed-Off-By: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-04 16:43:15 -08:00
Jan Beulich
91522a964b [PATCH] x86_64: small fix for CFI annotations
Conditionalize two unwind directives to match other similarly
conditional code.

Signed-Off-By: Jan Beulich <jbeulich@novell.com>
Cc: Jim Houston <jim.houston@ccur.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-04 16:43:15 -08:00
Andi Kleen
0c3749c41f [PATCH] x86_64: Calibrate APIC timer using PM timer
On some broken motherboards (at least one NForce3 based AMD64 laptop)
the PIT timer runs at a incorrect frequency.  This patch adds a new
option "apicpmtimer" that allows to use the APIC timer and calibrate it
using the PMTimer.  It requires the earlier patch that allows to run the
main timer from the APIC.

Specifying apicpmtimer implies apicmaintimer.

The option defaults to off for now.

I tested it on a few systems and the resulting APIC timer frequencies
were usually a bit off, but always <1%, which should be tolerable.

TBD figure out heuristic to enable this automatically on the affected
systems TBD perhaps do it on all NForce3s or using DMI?

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-04 16:43:15 -08:00
Andi Kleen
099f318b8d [PATCH] x86_64: Don't allow kprobes on __switch_to
kprobes cannot deal with the funny calling conventions when it
runs on a different stack when it returns. If someone wants
to instrument context switch they can add a probe to schedule()
instead.

Cc: jkenisto@us.ibm.com, prasanna@in.ibm.com

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-04 16:43:14 -08:00
Zach Brown
16acc0cd8f [PATCH] x86_64: align per-cpu section to configured cache bytes
Align the start of the per-cpu section to the configured number of bytes in a
cache line.  This stops a BUG_ON() from triggering in load_module() when
DEFINE_PER_CPU() is used in a module and the section isn't cacheline-aligned.
Rusty also found this and sent a patch in a while ago
(http://lkml.org/lkml/2004/10/19/17), I don't know what came of that.

Signed-off-by: Zach Brown <zach.brown@oracle.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-04 16:43:14 -08:00
Kevin VanMaren
a1002a48e1 [PATCH] x86_64: When allocation of merged SG lists fails in the IOMMU don't merge
[ AK: I redid Kevin's fix to be simpler, but the idea and original
  analysis of the problem is from Kevin]

This avoid allocation failures on some SATA systems like Nvidia CK8
when the IOMMU gets fragmented. Modern SATA devices have quite large queues
(128 entries) and the FS with ext2/3 is good enough now that it often
passes whole 128 page sg lists down to the driver. These require
512K of continuous free space in the IOMMU aperture to map when merged.
When the IOMMU is fragmented this could lead to spurious IO errors
due to failing mappings.

Short term fix is to just try to map the SG list again unmerged
page by page - this way fragmentation doesn't matter anymore.
The code for that was already there, but it just wasn't enabled for the
merge case.

According to Kevin at least the Nvidia device doesn't seem to benefit
from merging much anyways, so the only slowdown is from trying
to do an unnecessary merge attempt.

Kevin plans to implement better fragmentation avoidance in the future,
but that wouldn't be 2.6.16 material.

TBD: should add some statistic counters to count how often that really
happens.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-04 16:43:14 -08:00
Andi Kleen
1de6bf33bc [PATCH] x86_64: Fix zero mcfg entry workaround on x86-64
I broke this earlier when moving the patch from i386 to x86-64.
Need to return the virtual address here, not the physical address.
This fixes some boot time crashes on x86-64.

Cc: gregkh@suse.de

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-04 16:43:14 -08:00
Andi Kleen
d22fe80844 [PATCH] x86_64: Do more checking in the SRAT header code
- Check if the processor/memory affinity entries are long enough
   according to the ACPI 3.0 spec.
 - Ignore memory affinity entries that define a zero length region.

All based on BIOS issues found in the field @)

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-04 16:43:14 -08:00
Ashok Raj
7ded56895c [PATCH] x86_64: data/functions wrongly marked as __init with cpu hotplug.
attached patch is 2 more cases i found via running the reference_init.pl
script. These were easy to spot just knowing the file names. There is
one another about init/main.c that i cant exactly zero in. (partly
because i dont know how to interpret the data thats spewed out of the tool).

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-04 16:43:14 -08:00
Shaohua Li
396bd50fed [PATCH] x86_64: mark two routines as __cpuinit
SIgned-off-by: Shaohua Li<shaohua.li@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-04 16:43:14 -08:00
Andi Kleen
9391a3f9c7 [PATCH] x86_64: Clear more state when ignoring empty node in SRAT parsing
Might fix boot failures on systems with empty PXMs in SRAT

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-04 16:43:14 -08:00
Andi Kleen
7115125057 [PATCH] x86_64: Remove CONFIG_INIT_DEBUG
It has been enabled by default for some time now and is cheap enough
so it doesn't matter anyways.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-04 16:43:13 -08:00