Commit Graph

1176 Commits (a38a44c1a93078fc5fadc4ac2df8dea4697069e2)

Author SHA1 Message Date
Andi Kleen 9ddab42d1e [PATCH] Remove outdated comment in x86-64 mmconfig code
Cc: gregkh@suse.de
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:41 +02:00
Andi Kleen 27fbe5b28a [PATCH] Use string instructions for Core2 copy/clear
It is faster than using a unrolled loop for the use cases the kernel
cares about (cached, sizes typically < 4K)

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:41 +02:00
Matthew Garrett 35d534a3ff [PATCH] x86: - restore i8259A eoi status on resume
Got it. i8259A_resume calls init_8259A(0) unconditionally, even if
auto_eoi has been set. Keep track of the current status and restore that
on resume. This fixes it for AMD64 and i386.

Signed-off-by: Matthew Garrett <mjg59@srcf.ucam.org>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:41 +02:00
Aaron Durbin 4c6e052adf [PATCH] MMCONFIG and new Intel motherboards
On Sat, Sep 09, 2006 at 04:14:29PM +0200, Andi Kleen wrote:
> [patch] Looks reasonable, but probably not for 2.6.18 because this stuff
> is already too fragile and it is probably too risky to do any big changes now
> since not enough testing time is left. Can you please resubmit
> it with proper description and signed-off-by line? I can queue it for .19 then
>
> -Andi

Patch inserts PCI memory mapped config region(s) into the resource map.  This
will allow for the MMCCONFIG regions to be marked as busy in the iomem
address space as well as the regions(s) showing up in /proc/iomem.

Signed-off-by: Aaron Durbin <adurbin@google.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:40 +02:00
Aaron Durbin 56dd669a13 [PATCH] Insert GART region into resource map
Patch inserts the GART region into the iomem resource map. The GART will then
be visible within /proc/iomem. It will also allow for other users
utilizing the GART to subreserve the region (agp or IOMMU).

Signed-off-by: Aaron Durbin <adurbin@google.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:40 +02:00
Andi Kleen 9abd79280b [PATCH] i386/x86-64: Only do MCFG e820 check when type 1 works
Needs earlier patch to split type 1 probing from use.

This patch should fix the x86 macs where type 1 PCI config space access
doesn't work, but MCFG does. They also don't have a usable e820 table
so the e820 sanity check failed.

Instead assume now that if type 1 doesn't work then MCFG must work
and don't do the e820 check.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:40 +02:00
Andi Kleen 5e544d618f [PATCH] i386/x86-64: PCI: split probing and initialization of type 1 config space access
First probe if type1/2 accesses work, but then only initialize them at the end.

This is useful for a later patch that needs this information inbetween.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:40 +02:00
Andi Kleen a15da49deb [PATCH] Fix idle notifiers
Previously exit_idle would be called more often than enter_idle

Now instead of using complicated tests just keep track of it
using the per CPU variable as a flip flop.  I moved the idle state into the
PDA to make the access more efficient.

Original bug report and an initial patch from Stephane Eranian,
but redone by AK.

Cc: Stephane Eranian <eranian@hpl.hp.com>

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:40 +02:00
Eric W. Biederman 1c9c0a6ca3 [PATCH] Remove experimental mark of kexec
kexec has been marked experimental for a year now and all
of the serious problems have been worked through.  So it
is time (if not past time) to remove the experimental mark.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:40 +02:00
Andi Kleen b38337a624 [PATCH] Mark per cpu data initialization __initdata again
Before 2.6.16 this was changed to work around code that accessed
CPUs not in the possible map. But that code should be all fixed now,
so mark it __initdata again.
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:40 +02:00
Andi Kleen 26c13f2b5b [PATCH] Check return values of __copy_to_user in uname emulation
Quietens some new warnings
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:40 +02:00
Andi Kleen 95912008ba [PATCH] Add __must_check to copy_*_user
Following i386.

And also fix the two occurrences that caused warnings in arch/x86_64/*

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:39 +02:00
Andi Kleen 3022d734a5 [PATCH] Fix zeroing on exception in copy_*_user
- Don't zero for __copy_from_user_inatomic following i386.
This will prevent spurious zeros for parallel file system writers when
one does a exception
- The string instruction version didn't zero the output on
exception. Oops.

Also I cleaned up the code a bit while I was at it and added a minor
optimization to the string instruction path.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:39 +02:00
adurbin@google.com 54dbc0c9eb [PATCH] insert IOAPIC(s) and Local APIC into resource map
This patch places the IOAPIC(s) and the Local APIC specified by ACPI
tables into the resource map. The APICs will then be visible within
/proc/iomem

Signed-off-by: Aaron Durbin <adurbin@google.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:39 +02:00
Andi Kleen 7b0bda74f7 [PATCH] Fix a PDA warning uncovered by the new type checking
Fix
linux/arch/x86_64/kernel/process.c: In function __switch_to:
linux/arch/x86_64/kernel/process.c:626: warning: assignment makes integer from pointer without a cast

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:39 +02:00
Andi Kleen 96e540492a [PATCH] Fix a irqcount comment in entry.S
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:39 +02:00
Arjan van de Ven 4f7fd4d7a7 [PATCH] Add the -fstack-protector option to the CFLAGS
Add a feature check that checks that the gcc compiler has stack-protector
support and has the bugfix for PR28281 to make this work in kernel mode.
The easiest solution I could find was to have a shell script in scripts/
to do the detection; if needed we can make this fancier in the future
without making the makefile too complex.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
CC: Andi Kleen <ak@suse.de>
CC: Sam Ravnborg <sam@ravnborg.org>
2006-09-26 10:52:39 +02:00
Arjan van de Ven 0a42540580 [PATCH] Add the canary field to the PDA area and the task struct
This patch adds the per thread cookie field to the task struct and the PDA.
Also it makes sure that the PDA value gets the new cookie value at context
switch, and that a new task gets a new cookie at task creation time.

Signed-off-by: Arjan van Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andi Kleen <ak@suse.de>
CC: Andi Kleen <ak@suse.de>
2006-09-26 10:52:38 +02:00
Arjan van de Ven b62a5c740d [PATCH] Add the Kconfig option for the stackprotector feature
This patch adds the config options for -fstack-protector.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andi Kleen <ak@suse.de>
CC: Andi Kleen <ak@suse.de>
2006-09-26 10:52:38 +02:00
Andi Kleen e8c7391de4 [PATCH] Don't use kernel_text_address in oops context
Because it can take spinlocks.

Suggested by Mathieu Desnoyers

Cc: Mathieu Desnoyers <compudj@krystal.dyndns.org>

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:38 +02:00
Magnus Damm 4bfaaef01a [PATCH] Avoid overwriting the current pgd (V4, x86_64)
kexec: Avoid overwriting the current pgd (V4, x86_64)

This patch upgrades the x86_64-specific kexec code to avoid overwriting the
current pgd. Overwriting the current pgd is bad when CONFIG_CRASH_DUMP is used
to start a secondary kernel that dumps the memory of the previous kernel.

The code introduces a new set of page tables. These tables are used to provide
an executable identity mapping without overwriting the current pgd.

Signed-off-by: Magnus Damm <magnus@valinux.co.jp>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:38 +02:00
Keith Owens f574164491 [PATCH] Remove most of the special cases for the debug IST stack
Remove most of the special cases for the debug IST stack.  This is a
follow on clean up patch, it requires the bug fix patch that adds
orig_ist.

Signed-off-by: Keith Owens <kaos@ocs.com.au>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:38 +02:00
Andi Kleen 53ee11ae0d [PATCH] Optimize PDA accesses slightly
Based on a idea by Jeremy Fitzhardinge:

Replace the volatiles and memory clobbers in the PDA access with
telling gcc about access to a proxy PDA structure that doesn't
actually exist. But the dummy accesses give a defined ordering for
read/write accesses.

Also add some memory barriers to the early GS initialization to
make sure no PDA access is moved before it.

Advantage is some .text savings (probably most from better
code for accessing "current"):

   text    data     bss     dec     hex filename
4845647 1223688  615864 6685199  66020f vmlinux
4837780 1223688  615864 6677332  65e354 vmlinux-pda

1.2% smaller code

Cc:  Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:38 +02:00
Ian Campbell f2a9e1dec2 [PATCH] Put .note.* sections into a PT_NOTE segment
This patch updates x86_64 linker script to pack any .note.* sections
into a PT_NOTE segment in the output file.

To do this, we tell ld that we need a PT_NOTE segment.  This requires
us to start explicitly mapping sections to segments, so we also need
to explicitly create PT_LOAD segments for text and data, and map the
sections to them appropriately.  Fortunately, each section will
default to its previous section's segment, so it doesn't take many
changes to vmlinux.lds.S.

The corresponding change is already made for i386 in -mm and I'd like
this patch to join it. The section to segment mappings do change as do
the segment flags so some time in -mm would be good for that reason as
well, just in case.

In particular .data and .bss move from the text segment to the data
segment and .data.cacheline_aligned .data.read_mostly are put in the
data segment instead of a separate one.

I think that it would be possible to exactly match the existing section
to segment mapping and flags but it would be a more intrusive change and
I'm not sure there is a reason for the existing layout other than it is
what you get by default if you don't explicitly specify something else.
If there is a reason for the existing layout then I will of course make
the more intrusive change. If there is no reason we could probably drop
the executable or writable flags from some segments but I don't know how
much attention is paid to them anyway so it might not be worth the
effort.

The vsyscall related sections need to go in a different segment to the
normal data segment and so I invented a "user" segment to contain them.
I believe this should appear to be another data segment as far as the
kernel is concerned so the flags are setup accordingly.

The notes will be used in the Xen paravirt_ops backend to provide
additional information to the domain builder. I am in the process of
converting the xen-unstable kernels and tools over to this scheme at the
moment to support this in the future.

It has been suggested to me that the notes segment should have flags 0
(i.e. not readable) since it is only used by the loader and is not used
at runtime. For now I went with a readable segment since that is what
the i386 patch uses.

AK: dropped NOTES addition right now because the needed infrastructure
for that is not merged yet

Signed-off-by: Ian Campbell <ian.campbell@xensource.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:38 +02:00
Eric W. Biederman 26374c7b7d [PATCH] Reload CS when startup_64 is used.
In long mode the %cs is largely a relic.  However there are a few cases
like iret where it matters that we have a valid value.  Without this
patch it is possible to enter the kernel in startup_64 without setting
%cs to a valid value.  With this patch we don't care what %cs value
we enter the kernel with, so long as the cs shadow register indicates
it is a privileged code segment.

Thanks to Magnus Damm for finding this problem and posting the
first workable patch.  I have moved the jump to set %cs down a
few instructions so we don't need to take an extra jump.  Which
keeps the code simpler.

Signed-of-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:38 +02:00
Andi Kleen 8380aabb99 [PATCH] Remove non e820 fallbacks in high level code
Drop support for non e820 BIOS calls to get the memory map.

The boot assembler code still has some support, but not the C code now.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:37 +02:00
Paolo 'Blaisorblade' Giarrusso b3698c03eb [PATCH] Fix boot code head.S warning
When compiling a 64-bit kernel on an Ubuntu 6.06 32bit system (whose GCC is also
a cross-compiler for x86_64) I've seen that head.o is compiled as a 64-bit file
(while it should not) and ld complaining about this during linking:
[AK: it happens on all systems with new binutils]

ld: warning: i386:x86-64 architecture of input file
`arch/x86_64/boot/compressed/head.o' is incompatible with i386 output

I've verified that removing -m64 from compilation flags to turn
"-m64 -traditional -m32" into "-traditional -m32" fixes the issue.

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:37 +02:00
Andi Kleen 7a0a2dff1c [PATCH] Add a missing check for irq flags tracing in NMI
NMIs are not supposed to track the irq flags, but TRACE_IRQS_IRETQ
did it anyways. Add a check.

Cc: mingo@elte.hu

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:37 +02:00
Andi Kleen aecc63615e [PATCH] Fix coding style and output of the mptable parser
Give the printks a consistent prefix.
Add some missing white space.

Cc: len.brown@intel.com

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:37 +02:00
Andi Kleen e4251e130d [PATCH] Remove some cruft in apic id checking during processor setup
- Remove a define that was used only once
- Remove the too large APIC ID check because we always support
the full 8bit range of APICs.
- Restructure code a bit to be simpler.

Cc: len.brown@intel.com

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:37 +02:00
Andi Kleen f2c2cca3ac [PATCH] Remove APIC version/cpu capability mpparse checking/printing
ACPI went to great trouble to get the APIC version and CPU capabilities
of different CPUs before passing them to the mpparser. But all
that data was used was to print it out.  Actually it even faked some data
based on the boot cpu, not on the actual CPU being booted.

Remove all this code because it's not needed.

Cc: len.brown@intel.com

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:37 +02:00
Andi Kleen 5e6b0bfe5b [PATCH] Use proper accessors to change PSE bits in change_page_attr()
Use normal pte accessors in change_page_attr() to access the PSE
bits.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:37 +02:00
Andi Kleen df992848f5 [PATCH] Fix pte_exec/mkexec and use it in change_page_attr()
Fix the pte_exec/mkexec page table accessor functions to really
use the NX bit. Previously they only checked the USER bit, but
weren't actually used for anything.

Then use them in change_page_attr() to manipulate the NX bit
properly.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:37 +02:00
Andi Kleen d3cf7f0615 [PATCH] Remove bogus warning from early_ioremap
It is correct for its only caller right now, but not for possible
future others.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:37 +02:00
Andi Kleen 151f8cc116 [PATCH] Remove safe_smp_processor_id()
And replace all users with ordinary smp_processor_id.  The function
was originally added to get some basic oops information out even
if the GS register was corrupted. However that didn't
work for some anymore because printk is needed to print the oops
and it uses smp_processor_id() already. Also GS register corruptions
are not particularly common anymore.

This also helps the Xen port which would otherwise need to
do this in a special way because it can't access the local APIC.

Cc: Chris Wright <chrisw@sous-sol.org>

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:37 +02:00
Rafael J. Wysocki 34464a5b89 [PATCH] Detect clock skew during suspend
Detect the situations in which the time after a resume from disk would
be earlier than the time before the suspend and prevent them from
happening on x86_64.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:37 +02:00
Andrew Morton 1164c9994f [PATCH] make numa_emulation() __init
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:37 +02:00
Andi Kleen dbf9272e86 [PATCH] Don't force reserve the 640k-1MB range
From i386 x86-64 inherited code to force reserve the 640k-1MB area.
That was needed on some old systems.

But we generally trust the e820 map to be correct on 64bit systems
and mark all areas that are not memory correctly.

This patch will allow to use the real memory in there.

Or rather the only way to find out if it's still needed is to
try. So far I'm optimistic.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:36 +02:00
Magnus Damm ed77504b20 [PATCH] mark init_amd() as __cpuinit
The init_amd() function is only called from identify_cpu() which is already
marked as __cpuinit. So let's mark it as __cpuinit.

Signed-off-by: Magnus Damm <magnus@valinux.co.jp>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:36 +02:00
Keith Mannthey 6ad9165811 [PATCH] x86_64 kernel mapping fix
Fix for the x86_64 kernel mapping code.  Without this patch the update path
only inits one pmd_page worth of memory and tramples any entries on it.  now
the calling convention to phys_pmd_init and phys_init is to always pass a
[pmd/pud] page not an offset within a page.

Signed-off-by: Keith Mannthey<kmannth@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-09-26 10:52:36 +02:00
Andrew Morton abf0f10948 [PATCH] wire up oops_enter()/oops_exit()
Implement pause_on_oops() on x86_64.

AK: I redid the patch to do the oops_enter/exit in the existing
oops_begin()/end(). This makes it much shorter.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:36 +02:00
Arjan van de Ven e07e23e1fd [PATCH] non lazy "sleazy" fpu implementation
Right now the kernel on x86-64 has a 100% lazy fpu behavior: after *every*
context switch a trap is taken for the first FPU use to restore the FPU
context lazily.  This is of course great for applications that have very
sporadic or no FPU use (since then you avoid doing the expensive
save/restore all the time).  However for very frequent FPU users...  you
take an extra trap every context switch.

The patch below adds a simple heuristic to this code: After 5 consecutive
context switches of FPU use, the lazy behavior is disabled and the context
gets restored every context switch.  If the app indeed uses the FPU, the
trap is avoided.  (the chance of the 6th time slice using FPU after the
previous 5 having done so are quite high obviously).

After 256 switches, this is reset and lazy behavior is returned (until
there are 5 consecutive ones again).  The reason for this is to give apps
that do longer bursts of FPU use still the lazy behavior back after some
time.

[akpm@osdl.org: place new task_struct field next to jit_keyring to save space]
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-09-26 10:52:36 +02:00
Brice Goglin 40bee2ee73 [PATCH] fix bus numbering format in mmconfig warning
Make an mmconfig warning print the bus id with a regular format.

Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-09-26 10:52:35 +02:00
Eric W. Biederman ba4d40bb5c [PATCH] Auto size the per cpu area.
Now for a completely different but trivial approach.
I just boot tested it with 255 CPUS and everything worked.

Currently everything (except module data) we place in
the per cpu area we know about at compile time.  So
instead of allocating a fixed size for the per_cpu area
allocate the number of bytes we need plus a fixed constant
for to be used for modules.

It isn't perfect but it is much less of a pain to
work with than what we are doing now.

AK: fixed warning

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:35 +02:00
Andi Kleen 273819a2d9 [PATCH] make fault notifier unconditional and export it
It's needed for external debuggers and overhead is very small.

Also make the actual notifier chain they use static

Cc: jbeulich@novell.com

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:35 +02:00
Andi Kleen 2717941c6a [PATCH] Make boot_param_data pure BSS
Since it's all zero.

Actually I think gcc 4+ will do that automatically, but earlier compilers won't
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:35 +02:00
Andi Kleen 1edf777803 [PATCH] i386/x86-64: Improve Kconfig description of CRASH_DUMP
Improve Kconfig description of CONFIG_CRASH_DUMP. Previously
it was too brief to be useful.

Cc: vgoyal@in.ibm.com
Cc: ebiederm@xmission.com

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:35 +02:00
Dimitri Sivanich cbf9b4bb76 [PATCH] X86_64 monotonic_clock goes backwards
I've noticed some erratic behavior while testing the X86_64 version
of monotonic_clock().

While spinning in a loop reading monotonic clock values (pinned to a
single cpu) I noticed that the difference between subsequent values
occasionally went negative (time going backwards).

I found that in the following code:
                this_offset = get_cycles_sync();
                /* FIXME: 1000 or 1000000? */
-->             offset = (this_offset - last_offset)*1000 / cpu_khz;
        }
        return base + offset;

the offset sometimes turns out to be 0, even though
this_offset > last_offset.

+Added fix From: Toyo Abe <toyoa@mvista.com>

The x86_64-mm-monotonic-clock.patch in 2.6.18-rc4-mm2 made a change to
the updating of monotonic_base. It now uses cycles_2_ns().

I suggest that a set_cyc2ns_scale() should be done prior to the setup_irq().
Because cycles_2_ns() can be called from the timer ISR right after the irq0
is enabled.

Signed-off-by: Toyo Abe <toyoa@mvista.com>
Signed-off-by: Dimitri Sivanich <sivanich@sgi.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:34 +02:00
Prasanna S.P d28c4393a7 [PATCH] x86: error_code is not safe for kprobes
This patch moves the entry.S:error_entry to .kprobes.text section,
since code marked unsafe for kprobes jumps directly to entry.S::error_entry,
that must be marked unsafe as well.
This patch also moves all the ".previous.text" asm directives to ".previous"
for kprobes section.

AK: Following a similar i386 patch from Chuck Ebbert
AK: Also merged Jeremy's fix in.

+From: Jeremy Fitzhardinge <jeremy@goop.org>

KPROBE_ENTRY does a .section .kprobes.text, and expects its users to
do a .previous at the end of the function.

Unfortunately, if any code within the function switches sections, for
example .fixup, then the .previous ends up putting all subsequent code
into .fixup.  Worse, any subsequent .fixup code gets intermingled with
the code its supposed to be fixing (which is also in .fixup).  It's
surprising this didn't cause more havok.

The fix is to use .pushsection/.popsection, so this stuff nests
properly.  A further cleanup would be to get rid of all
.section/.previous pairs, since they're inherently fragile.

+From: Chuck Ebbert <76306.1226@compuserve.com>

Because code marked unsafe for kprobes jumps directly to
entry.S::error_code, that must be marked unsafe as well.
The easiest way to do that is to move the page fault entry
point to just before error_code and let it inherit the same
section.

Also moved all the ".previous" asm directives for kprobes
sections to column 1 and removed ".text" from them.

Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:34 +02:00
Andi Kleen be7a91709b [PATCH] Check for end of stack trace before falling back
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:34 +02:00
Andi Kleen c0b766f13d [PATCH] Merge stacktrace and show_trace
This unifies the standard backtracer and the new stacktrace
in memory backtracer. The standard one is converted to use callbacks
and then reimplement stacktrace using new callbacks.

The main advantage is that stacktrace can now use the new dwarf2 unwinder
and avoid false positives in many cases.

I kept it simple to make sure the standard backtracer stays reliable.

Cc: mingo@elte.hu

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:34 +02:00
Andi Kleen b7f5e3c774 [PATCH] Don't access the APIC in safe_smp_processor_id when it is not mapped yet
Lockdep can call the dwarf2 unwinder early, and the dwarf2 code
uses safe_smp_processor_id which tries to access the local APIC page.
But that doesn't work before the APIC code has set up its fixmap.

Check for this case and always return boot cpu then.

Cc: jbeulich@novell.com
Cc: mingo@elte.hu

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:34 +02:00
Andi Kleen 5a1b3999d6 [PATCH] x86: Some preparationary cleanup for stack trace
- Remove unused all_contexts parameter
No caller used it
- Move skip argument into the structure (needed for
followon patches)

Cc: mingo@elte.hu

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:34 +02:00
Muli Ben-Yehuda 4ea8a5d8b5 [PATCH] Calgary IOMMU: eradicate sole remaining 80 chars per line offender
Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:33 +02:00
Muli Ben-Yehuda 4ccf4ae314 [PATCH] remove tce_cache_blast_stress()
tce_cache_blast_stress was useful during bringup to stress the IOMMU's
cache flushing. Now that we quiesce DMAs on every cache flush, using
_stress() brings the machine down to its knees once you put it under
load. Remove this debug / bringup code that isn't useful anymore
completely.

Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:33 +02:00
Muli Ben-Yehuda 796e4390e0 [PATCH] only verify the allocation bitmap if CONFIG_IOMMU_DEBUG is on
Introduce new function verify_bit_range(). Define two versions, one
for CONFIG_IOMMU_DEBUG enabled and one for disabled. Previously we
were checking that the bitmap was consistent every time we allocated
or freed an entry in the TCE table, which is good for debugging but
incurs an unnecessary penalty on non debug builds.

Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:33 +02:00
Muli Ben-Yehuda de684652f3 [PATCH] print whether CONFIG_IOMMU_DEBUG is enabled
Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:33 +02:00
Chuck Ebbert 2ade2920dc [PATCH] i386/x86-64: rename is_at_popf(), add iret to tests and fix
is_at_popf() needs to test for the iret instruction as well as
popf.  So add that test and rename it to is_setting_trap_flag().

Also change max insn length from 16 to 15 to match reality.

LAHF / SAHF can't affect TF, so the comment in x86_64 is removed.

Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:33 +02:00
Fernando Luis Vázquez Cao 2b94ab2fd5 [PATCH] Replace local_save_flags+local_irq_disable with
The combination of "local_save_flags" and "local_irq_disable" seems to be
equivalent to "local_irq_save" (see code snips below). Consequently, replace
occurrences of local_save_flags+local_irq_disable with local_irq_save.

* local_irq_save
#define raw_local_irq_save(flags) \
                do { (flags) = __raw_local_irq_save(); } while (0)

static inline unsigned long __raw_local_irq_save(void)
{
        unsigned long flags = __raw_local_save_flags();

        raw_local_irq_disable();

        return flags;
}

* local_save_flags
#define raw_local_save_flags(flags) \
                do { (flags) = __raw_local_save_flags(); } while (0)

Signed-off-by: Fernando Vazquez <fernando@intellilink.co.jp>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:33 +02:00
Andi Kleen 52d522f53f [PATCH] Fix sparse warnings in compat aout code
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:33 +02:00
Andi Kleen ddb15ec130 [PATCH] Fix most sparse warnings in sys_ia32.c
Mostly by adding casts.

I didn't touch the "invalid access past ..." which are caused
by the sigset conversion.
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:33 +02:00
Andi Kleen dd2994f619 [PATCH] Add sparse annotations to quiet sparse in arch/x86_64/mm/fault.c
Fixes

linux/arch/x86_64/mm/fault.c:125:7: warning: incorrect type in argument 1 (different address spaces)
linux/arch/x86_64/mm/fault.c:125:7:    expected void [noderef] *<noident><asn:1>
linux/arch/x86_64/mm/fault.c:125:7:    got unsigned char *[assigned] instr
linux/arch/x86_64/mm/fault.c:163:8: warning: incorrect type in argument 1 (different address spaces)
linux/arch/x86_64/mm/fault.c:163:8:    expected void [noderef] *<noident><asn:1>
linux/arch/x86_64/mm/fault.c:163:8:    got unsigned char *[assigned] instr
linux/arch/x86_64/mm/fault.c:179:9: warning: incorrect type in argument 1 (different address spaces)
linux/arch/x86_64/mm/fault.c:179:9:    expected void [noderef] *<noident><asn:1>
linux/arch/x86_64/mm/fault.c:179:9:    got unsigned long *<noident>

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:33 +02:00
Andi Kleen 131cfd7bd5 [PATCH] Add sparse annotation to vsyscall.c
Fixes

linux/arch/x86_64/kernel/vsyscall.c:276:7: warning: constant 0x0f40000000000 is so big it is long
linux/arch/x86_64/kernel/vsyscall.c:80:14: warning: incorrect type in argument 1 (different address spaces)
linux/arch/x86_64/kernel/vsyscall.c:80:14:    expected void const volatile [noderef] *addr<asn:2>
linux/arch/x86_64/kernel/vsyscall.c:80:14:    got void *<noident>
linux/arch/x86_64/kernel/vsyscall.c:200:7: warning: incorrect type in assignment (different address spaces)
linux/arch/x86_64/kernel/vsyscall.c:200:7:    expected unsigned short [usertype] *map1
linux/arch/x86_64/kernel/vsyscall.c:200:7:    got void [noderef] *<asn:2>
linux/arch/x86_64/kernel/vsyscall.c:203:7: warning: incorrect type in assignment (different address spaces)
linux/arch/x86_64/kernel/vsyscall.c:203:7:    expected unsigned short [usertype] *map2
linux/arch/x86_64/kernel/vsyscall.c:203:7:    got void [noderef] *<asn:2>
linux/arch/x86_64/kernel/vsyscall.c:215:10: warning: incorrect type in argument 1 (different address spaces)
linux/arch/x86_64/kernel/vsyscall.c:215:10:    expected void volatile [noderef] *addr<asn:2>
linux/arch/x86_64/kernel/vsyscall.c:215:10:    got unsigned short [usertype] *map2
linux/arch/x86_64/kernel/vsyscall.c:217:10: warning: incorrect type in argument 1 (different address spaces)
linux/arch/x86_64/kernel/vsyscall.c:217:10:    expected void volatile [noderef] *addr<asn:2>
linux/arch/x86_64/kernel/vsyscall.c:217:10:    got unsigned short [usertype] *map1

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:33 +02:00
Andi Kleen 3bd4d18cba [PATCH] Move e820 map into e820.c
Minor cleanup. Keep setup.c free from unrelated clutter.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:33 +02:00
Andi Kleen c31fbb1ad8 [PATCH] Clean up acpi_numa variable
Move it into srat.c No need to clutter up setup.c for it

And remove use in setup.c completely - it only guarded a printk
which can be done unconditionally.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:33 +02:00
Andi Kleen df3bb57d2c [PATCH] i386/x86-64: Move acpi_disabled variables into acpi/boot.c
Removes code duplication between i386/x86-64.

Not needed anymore in setup.c since early_param cleanup

Cc: len.brown@intel.com
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:33 +02:00
Andi Kleen 43c85c9c5d [PATCH] Remove need for early lockdep init
I think it was only needed for the printks and we can do them later.

I put in a single early_printk so that we know the kernel is alive
(early_printk doesn't need any locks)

This makes some things easier for initialization of unwind for
lockdep, which is needed by later patches.

cc: mingo@elte.hu

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:32 +02:00
Andi Kleen 2c8c0e6b8d [PATCH] Convert x86-64 to early param
Instead of hackish manual parsing

Requires earlier i386 patchkit, but also fixes i386 early_printk again.

I removed some obsolete really early parameters which didn't do anything useful.
Also made a few parameters that needed it early (mostly oops printing setup)

Also removed one panic check that wasn't visible without
early console anyways (the early console is now initialized after that
panic)

This cleans up a lot of code.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:32 +02:00
Andi Kleen 9ca33eb698 [PATCH] Use early CPU identify before early command line parsing
This makes it possible to modify CPU flags in command line
options without hacks.

And remove another copy in head64.c

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:32 +02:00
Chuck Ebbert d4d35854a1 [PATCH] remove lock prefix from is_at_popf() tests
The lock prefix will cause an exception when used with the
popf instruction, so no need to continue searching after it's
found.

Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:32 +02:00
Muli Ben-Yehuda 145106e810 [PATCH] remove superflous BUG_ON's in nommu and gart
There's no need to check for invalid DMA data direction in nommu and
gart since we do it in dma-mapping.h anyway before calling the
individual dma-ops.

Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:32 +02:00
Eric W. Biederman 29a6c25bd6 [PATCH] Fix gdt table size in trampoline.S
Allows easier extension of the GDT by using the proper C symbol
for the size in the descriptor.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:32 +02:00
Jan Beulich 8d379dad8f [PATCH] annotate arch/x86_64/lib/*.S
Add unwind annotations to arch/x86_64/lib/*.S, and also use the macros
provided by linux/linkage.h where-ever possible.

Some of the alternative instructions handling needed to be adjusted so
that the replacement code would also have valid unwind information.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:32 +02:00
Chuck Ebbert a752d7194c [PATCH] fix is_at_popf() for compat tasks
When testing for the REX instruction prefix, first check
for 32-bit mode because in compat mode the REX prefix is an
increment instruction.

Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:32 +02:00
Muli Ben-Yehuda 0577f148b5 [PATCH] Calgary IOMMU: save a bit of space in bus_info
Make translation_disabled a uchar rather than an int

Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:31 +02:00
Muli Ben-Yehuda a4fc520a0f [PATCH] Calgary IOMMU: calgary_init_one_nontraslated() can return void
Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:31 +02:00
Muli Ben-Yehuda 871b17008e [PATCH] Calgary IOMMU: fix reference counting of Calgary PCI devices
The pci_get_device() API decrements the reference count on the 'from'
parameter when it continues searching. Therefore, take a ref count on
Calgary bus when we initialize them in either translated or
non-translated mode.

Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:31 +02:00
Muli Ben-Yehuda b8f4fe66a5 [PATCH] Calgary IOMMU: fix error path memleak in calgary_free_tar
We were freeing the iommu_table and leaking the bitmap pages. Also
rename it to calgary_free_bus, which is more accurate.

Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:31 +02:00
Muli Ben-Yehuda 9f2dc46d5e [PATCH] Calgary IOMMU: break out of pci_find_device_reverse if dev not found
Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:31 +02:00
Muli Ben-Yehuda f38db651d5 [PATCH] Calgary IOMMU: consolidate per bus data structures
Move the tce_table_kva array, disabled bitmap and bus_to_phb array
into a new per bus 'struct calgary_bus_info'. Also slightly reorganize
build_tce_table and tce_table_setparms to avoid exporting bus_info to
tce.c.

Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:31 +02:00
Jan Beulich 3b94355c47 [PATCH] remove int_delivery_dest
The genapic field and the accessor macro weren't used anywhere.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:31 +02:00
Jan Beulich caff0710eb [PATCH] initialize end of memory variables as early as possible
While an earlier patch already did a small step into that direction,
this patch moves initialization of all memory end variables to as
early as possible, so that dependent code doesn't need to check
whether these variables have already been set.

Also, remove a misleading (perhaps just outdated) comment, and make
static a variable only used in a single file.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:31 +02:00
Andi Kleen 44cc45267b [PATCH] Remove obsolete CVS $Id$ from assembler files in arch/x86_64/kernel/*
CVS hasn't been used for a long time for them.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:31 +02:00
Andi Kleen e2414910f2 [PATCH] x86: Detect CFI support in the assembler at runtime
... instead of using a CONFIG option. The config option still controls
if the resulting executable actually has unwind information.

This is useful to prevent compilation errors when users select
CONFIG_STACK_UNWIND on old binutils and also allows to use
CFI in the future for non kernel debugging applications.

Cc: jbeulich@novell.com
Cc: sam@ravnborg.org

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:30 +02:00
Andi Kleen fe7414a288 [PATCH] Use BUILD_BUG_ON in apic.c build sanity checking
Makes code a little shorter.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:30 +02:00
Andi Kleen efec3b9a32 [PATCH] Fix up some non linuxy style in ACPI functions in mpparse.c
No functional changes.

Cc: len.brown@intel.com
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:30 +02:00
Andi Kleen 276ec1a76a [PATCH] Remove some unneeded ACPI externs in mpparse.c
They are not used in this file so remove them. i386 didn't have them either.

Cc: len.brown@intel.com
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:30 +02:00
Andi Kleen a01fd3baff [PATCH] Remove useless wrapper in mpparse.c code
It used to contain support code for NUMAQ, but that is long gone already
on 64bit.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:30 +02:00
Andi Kleen 55f05ffaa7 [PATCH] Replace mp bus array with bitmap for bus not pci
Since we only support PCI and ISA legacy busses now there is no need to
have an full array with checking.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:30 +02:00
Andi Kleen dfa4698c50 [PATCH] Move early chipset quirks out to new file
They did not really belong into io_apic.c. Move them into a new file
and clean it up a bit.

Also remove outdated ATI quirk that was obsolete,

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:30 +02:00
Andi Kleen edd9652296 [PATCH] Remove MPS table APIC renumbering
The MPS table specification says that the operating system should
renumber the IO-APICs following the table as needed.  However in
ACPI this is not allowed or neeeded and all x86-64 systems are ACPI
compliant.

The code was already disabled on some systems because it caused
problems there. Remove it completely now.

CC: mdomsch@dell.com

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:30 +02:00
Diego Calleja 606bd58de6 [PATCH] x86: AUX_DEVICE_INFO is one byte long, use 'movb'
Bugzilla #6552 says:

"In arch/i386/boot/setup.S, movw is used instead of movb for PS/2 mouse
information, although it is unsigned char. This does not harm, because
the jmp instruction overwritten by movw is used before executing movw,
and never be used again"

I've no idea if this is a real bug or how it gets fixed, so I'm submitting
it for review instead of letting it die of boredom in bugzilla. Aditionally
to i386, I've changed x86-64, which mirrors the same code.

Credits to Yoshinori K. Okuji, who found the problem and suggested a fix.

Signed-off-by: Diego Calleja <diegocg@gmail.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:30 +02:00
Andi Kleen eea0e11c1f [PATCH] Factor out common io apic routing entry access
The IO APIC code had lots of duplicated code to read/write 64bit
routing entries into the IO-APIC. Factor this out int common read/write
functions

In a few cases the IO APIC lock is taken more often now, but this
isn't a problem because it's all initialization/shutdown only
slow path code.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:30 +02:00
Andi Kleen c1a58b42b4 [PATCH] i386/x86-64: Remove obsolete sanity check in mptable parsing
It apparently has never triggered in many years.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:30 +02:00
Andi Kleen a8fcf1a24a [PATCH] Remove obsolete PIC mode
PIC mode is an outdated way to drive the APICs that was used on
some early MP boards. It is not supported in the ACPI model.

It is unlikely to be ever configured by any x86-64 system

Remove it thus.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:30 +02:00
Andi Kleen e509913434 [PATCH] Remove leftover MCE/EISA support
No 64bit EISA or Microchannel systems ever. Remove the left over code
in the IO-APIC driver and the mptable parser

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:29 +02:00
Andi Kleen 5cb6b99928 [PATCH] Remove pirq overwrite support
This was an old workaround for broken MP-BIOS. The user could
specify overwrites on the command line.

I've never seen it being used for anything on 64bit. So get
rid of it for now.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:29 +02:00
Andi Kleen 2e91a17b35 [PATCH] Add some comments to entry.S
And remove some old obsolete ones.
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:29 +02:00
Andi Kleen 3f14c746a6 [PATCH] Remove old "focus disabled" chipset errata workaround
The new systems already use focus disabled and the comment was
completely outdated.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:29 +02:00
Andi Kleen 6c96a29f20 [PATCH] Remove apic mismatch counter
Nobody has been setting the mismatch counter and the ifdef was never
set so remove it.
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:29 +02:00
Andi Kleen 7f11d8a5ef [PATCH] Remove all ifdefs for local/io apic
IO-APIC or local APIC can only be disabled at runtime anyways and
Kconfig has forced these options on for a long time now.

The Kconfigs are kept only now for the benefit of the shared acpi
boot.c code.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:29 +02:00
Andi Kleen 5ba5891d44 [PATCH] Add some comments what tce.c actually does
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:29 +02:00
Andi Kleen cc1e684a9f [PATCH] Remove leftover CVS Id in thunk.S
And move the comment to a proper place.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:29 +02:00
Andi Kleen b06babac45 [PATCH] Add proper alignment to ENTRY
Previously it didn't align. Use the same one as the C compiler
in blended mode, which is good for K8 and Core2 and doesn't hurt
on P4.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:29 +02:00
Andi Kleen 9a0b26e6bc [PATCH] Clean up read write lock assembly
- Move the slow path fallbacks to their own assembly files
This makes them much easier to read and is needed for the next change.
- Add CFI annotations for unwinding (XXX need review)
- Remove constant case which can never happen with out of line spinlocks
- Use patchable LOCK prefixes
- Don't use lock sections anymore for inline code because they can't
be expressed by the unwinder (this adds one taken jump to the lock
fast path)

Cc: jbeulich@novell.com

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:28 +02:00
Andi Kleen 31679f38d8 [PATCH] Simplify profile_pc on x86-64
Use knowledge about EFLAGS layout (bits 22:63 are always 0) to distingush
EFLAGS word and kernel address in the spin lock stack frame.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:28 +02:00
Andi Kleen c16b63e09d [PATCH] i386/x86-64: Don't randomize stack top when no randomization personality is set
Based on patch from Frank van Maarseveen <frankvm@frankvm.com>, but
extended.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:28 +02:00
Adam Henley d5d9ca6d88 [PATCH] A few trivial spelling and grammar fixes
A few trivial spelling and grammar mistakes picked up in
"arch/x86_64/aperture.c", "arch/x86_64/crash.c" and
"arch/x86_64/apic.c". I think all are correct fixes but am ever aware
of my fallibility :o) This is my first patch submission so all
feedback is appreciated, esp. WRT CCing to Linus, Andi and
trivial@kernel.org, is this correct? And which is the most appropriate
kernel version to diff against? If any.

Should apply cleanly to 2.6.18-rc1

Signed-off-by: Adam Henley <adamazing@gmail.com>
Signed-off-by: Andi Kleen <ak@suse.de>

-  adam
2006-09-26 10:52:28 +02:00
Stephane Eranian d3a4f48d48 [PATCH] x86-64 TIF flags for debug regs and io bitmap in ctxsw
Hello,

Following my discussion with Andi. Here is a patch that introduces
two new TIF flags to simplify the context switch code in __switch_to().
The idea is to minimize the number of cache lines accessed in the common
case, i.e., when neither the debug registers nor the I/O bitmap are used.

This patch covers the x86-64 modifications. A patch for i386 follows.

Changelog:
	- add TIF_DEBUG to track when debug registers are active
	- add TIF_IO_BITMAP to track when I/O bitmap is used
	- modify __switch_to() to use the new TIF flags

<signed-off-by>: eranian@hpl.hp.com

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:28 +02:00
Andi Kleen 2f766d1606 [PATCH] Clean up asm/smp.h includes
No need to include it from entry.S
Drop all the #ifdef __ASSEMBLY__

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:28 +02:00
Andi Kleen 3cfc348bf9 [PATCH] x86: Add portable getcpu call
For NUMA optimization and some other algorithms it is useful to have a fast
to get the current CPU and node numbers in user space.

x86-64 added a fast way to do this in a vsyscall. This adds a generic
syscall for other architectures to make it a generic portable facility.

I expect some of them will also implement it as a faster vsyscall.

The cache is an optimization for the x86-64 vsyscall optimization. Since
what the syscall returns is an approximation anyways and user space
often wants very fast results it can be cached for some time.  The norma
methods to get this information in user space are relatively slow

The vsyscall is in a better position to manage the cache because it has direct
access to a fast time stamp (jiffies). For the generic syscall optimization
it doesn't help much, but enforce a valid argument to keep programs
portable

I only added an i386 syscall entry for now. Other architectures can follow
as needed.

AK: Also added some cleanups from Andrew Morton

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:28 +02:00
Vojtech Pavlik c08c820508 [PATCH] Add the vgetcpu vsyscall
This patch adds a vgetcpu vsyscall, which depending on the CPU RDTSCP
capability uses either the RDTSCP or CPUID to obtain a CPU and node
numbers and pass them to the program.

AK: Lots of changes over Vojtech's original code:
Better prototype for vgetcpu()
It's better to pass the cpu / node numbers as separate arguments
to avoid mistakes when going from SMP to NUMA.
Also add a fast time stamp based cache using a user supplied
argument to speed things more up.
Use fast method from Chuck Ebbert to retrieve node/cpu from
GDT limit instead of CPUID
Made sure RDTSCP init is always executed after node is known.
Drop printk

Signed-off-by: Vojtech Pavlik <vojtech@suse.cz>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:28 +02:00
Vojtech Pavlik a670fad0ad [PATCH] Add initalization of the RDTSCP auxilliary values
This patch adds initalization of the RDTSCP auxilliary values to CPU numbers
to time.c. If RDTSCP is available, the MSRs are written with the respective
values. It can be later used to initalize per-cpu timekeeping variables.

AK: Some cleanups. Move externs into headers and fix CPU hotplug.

Signed-off-by: Vojtech Pavlik <vojtech@suse.cz>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:28 +02:00
Venkatesh Pallipadi 248dcb2fff [PATCH] x86: i386/x86-64 Add nmi watchdog support for new Intel CPUs
AK: This redoes the changes I temporarily reverted.

Intel now has support for Architectural Performance Monitoring Counters
( Refer to IA-32 Intel Architecture Software Developer's Manual
http://www.intel.com/design/pentium4/manuals/253669.htm ). This
feature is present starting from Intel Core Duo and Intel Core Solo processors.

What this means is, the performance monitoring counters and some performance
monitoring events are now defined in an architectural way (using cpuid).
And there will be no need to check for family/model etc for these architectural
events.

Below is the patch to use this performance counters in nmi watchdog driver.
Patch handles both i386 and x86-64 kernels.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:27 +02:00
Vivek Goyal 3f22c5789e [PATCH] kdump x86_64 nmi event notification fix
After a crash we should wait for NMI IPI event and not for external NMI or
NMI watchdog tick.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-09-26 10:52:27 +02:00
Andi Kleen fac58550e8 [PATCH] Fix up panic messages for different NMI panics
When a unknown NMI happened the panic would claim a NMI watchdog timeout.
Also it would check the variable set by nmi_watchdog=panic and panic then.

Fix up the panic message to be generic
Unconditionally panic on unknown NMI when panic on unknown nmi is enabled.

Noticed by Jan Beulich

Cc: jbeulich@novell.com

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:27 +02:00
Shaohua Li 4038f901cf [PATCH] i386/x86-64: Fix NMI watchdog suspend/resume
Making NMI suspend/resume work with SMP. We use CPU hotplug to offline
APs in SMP suspend/resume. Only BSP executes sysdev's .suspend/.resume
method. APs should follow CPU hotplug code path.

And:

+From: Don Zickus <dzickus@redhat.com>

Makes the start/stop paths of nmi watchdog more robust to handle the
suspend/resume cases more gracefully.

AK: I merged the two patches together

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-09-26 10:52:27 +02:00
Don Zickus c41c5cd3b2 [PATCH] x86: x86 clean up nmi panic messages
Clean up some of the output messages on the nmi error paths to make more
sense when they are displayed.  This is mainly a cosmetic fix and
shouldn't impact any normal code path.

Signed-off-by:  Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:27 +02:00
Don Zickus 8da5adda91 [PATCH] x86: Allow users to force a panic on NMI
To quote Alan Cox:

The default Linux behaviour on an NMI of either memory or unknown is to
continue operation. For many environments such as scientific computing
it is preferable that the box is taken out and the error dealt with than
an uncorrected parity/ECC error get propogated.

A small number of systems do generate NMI's for bizarre random reasons
such as power management so the default is unchanged. In other respects
the new proc/sys entry works like the existing panic controls already in
that directory.

This is separate to the edac support - EDAC allows supported chipsets to
handle ECC errors well, this change allows unsupported cases to at least
panic rather than cause problems further down the line.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:27 +02:00
Don Zickus e33e89ab1a [PATCH] x86: Add abilty to enable/disable nmi watchdog from procfs (update)
Adds a new /proc/sys/kernel/nmi_watchdog call that will enable/disable the
nmi watchdog.

By entering a non-zero value here, a user can enable the nmi watchdog to
monitor the online cpus in the system.  By entering a zero value here, a
user can disable the nmi watchdog and free up a performance counter which
could then be utilized by the oprofile subsystem, otherwise oprofile may be
short a counter when in use.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-09-26 10:52:27 +02:00
Don Zickus 407984f1af [PATCH] x86: Add abilty to enable/disable nmi watchdog with sysctl
Adds a new /proc/sys/kernel/nmi call that will enable/disable the nmi
watchdog.

Signed-off-by:  Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:27 +02:00
Don Zickus 2fbe7b25c8 [PATCH] i386/x86-64: Remove un/set_nmi_callback and reserve/release_lapic_nmi functions
Removes the un/set_nmi_callback and reserve/release_lapic_nmi functions as
they are no longer needed.  The various subsystems are modified to register
with the die_notifier instead.

Also includes compile fixes by Andrew Morton.

Signed-off-by:  Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:27 +02:00
Andi Kleen 957dc87c1b [PATCH] Add ppoll/pselect syscalls
Needed TIF_RESTORE_SIGMASK first

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:27 +02:00
Andi Kleen 1d001df19d [PATCH] Add TIF_RESTORE_SIGMASK
We need TIF_RESTORE_SIGMASK in order to support ppoll() and pselect()
system calls. This patch originally came from Andi, and was based
heavily on David Howells' implementation of same on i386. I fixed a typo
which was causing do_signal() to use the wrong signal mask.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:26 +02:00
Don Zickus 3adbbcce9a [PATCH] x86: Cleanup NMI interrupt path
This patch cleans up the NMI interrupt path.  Instead of being gated by if
the 'nmi callback' is set, the interrupt handler now calls everyone who is
registered on the die_chain and additionally checks the nmi watchdog,
reseting it if enabled.  This allows more subsystems to hook into the NMI if
they need to (without being block by set_nmi_callback).

Signed-off-by:  Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:26 +02:00
Don Zickus f2802e7f57 [PATCH] Add SMP support on x86_64 to reservation framework
This patch includes the changes to make the nmi watchdog on x86_64 SMP
aware.  A bunch of code was moved around to make it simpler to read.  In
addition, it is now possible to determine if a particular NMI was the result
of the watchdog or not.  This feature allows the kernel to filter out
unknown NMIs easier.

Signed-off-by:  Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:26 +02:00
Don Zickus 828f0afda1 [PATCH] x86: Add performance counter reservation framework for UP kernels
Adds basic infrastructure to allow subsystems to reserve performance
counters on the x86 chips.  Only UP kernels are supported in this patch to
make reviewing easier.  The SMP portion makes a lot more changes.

Think of this as a locking mechanism where each bit represents a different
counter.  In addition, each subsystem should also reserve an appropriate
event selection register that will correspond to the performance counter it
will be using (this is mainly neccessary for the Pentium 4 chips as they
break the 1:1 relationship to performance counters).

This will help prevent subsystems like oprofile from interfering with the
nmi watchdog.

Signed-off-by:  Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:26 +02:00
Andi Kleen b07f8915cd [PATCH] x86: Temporarily revert parts of the Core 2 nmi nmi watchdog support
This makes merging easier.  They are readded a few patches later.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:26 +02:00
Andi Kleen 265baba316 [PATCH] Update defconfig
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:26 +02:00
Herbert Xu 560c06ae1a [CRYPTO] api: Get rid of flags argument to setkey
Now that the tfm is passed directly to setkey instead of the ctx, we no
longer need to pass the &tfm->crt_flags pointer.

This patch also gets rid of a few unnecessary checks on the key length
for ciphers as the cipher layer guarantees that the key length is within
the bounds specified by the algorithm.

Rather than testing dia_setkey every time, this patch does it only once
during crypto_alloc_tfm.  The redundant check from crypto_digest_setkey
is also removed.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2006-09-21 11:41:02 +10:00
Joachim Fritschi eaf44088ff [CRYPTO] twofish: x86-64 assembly version
The patch passed the trycpt tests and automated filesystem tests.
This rewrite resulted in some nice perfomance increase over my last patch.

Short summary of the tcrypt benchmarks:

Twofish Assembler vs. Twofish C (256bit 8kb block CBC)
encrypt: -27% Cycles
decrypt: -23% Cycles

Twofish Assembler vs. AES Assembler (128bit 8kb block CBC)
encrypt: +18%  Cycles
decrypt: +15% Cycles

Twofish Assembler vs. AES Assembler (256bit 8kb block CBC)
encrypt: -9% Cycles
decrypt: -8% Cycles

Full Output:
http://homepages.tu-darmstadt.de/~fritschi/twofish/tcrypt-speed-twofish-c-x86_64.txt
http://homepages.tu-darmstadt.de/~fritschi/twofish/tcrypt-speed-twofish-asm-x86_64.txt
http://homepages.tu-darmstadt.de/~fritschi/twofish/tcrypt-speed-aes-asm-x86_64.txt


Here is another bonnie++ benchmark with encrypted filesystems. Most runs maxed
out the hd. It should give some idea what the module can do for encrypted filesystem
performance even though you can't see the full numbers.

http://homepages.tu-darmstadt.de/~fritschi/twofish/output_20060610_130806_x86_64.html

Signed-off-by: Joachim Fritschi <jfritschi@freenet.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2006-09-21 11:16:29 +10:00
Linus Torvalds 79e453d49b Revert mmiocfg heuristics and blacklist changes
This reverts commits 11012d419c and
40dd2d20f2, which allowed us to use the
MMIO accesses for PCI config cycles even without the area being marked
reserved in the e820 memory tables.

Those changes were needed for EFI-environment Intel macs, but broke some
newer Intel 965 boards, so for now it's better to revert to our old
2.6.17 behaviour and at least avoid introducing any new breakage.

Andi Kleen has a set of patches that work with both EFI and the broken
Intel 965 boards, which will be applied once they get wider testing.

Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Edgar Hucek <hostmaster@ed-soft.at>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-19 08:15:22 -07:00
Al Viro e65e1fc2d2 [PATCH] syscall class hookup for all normal targets
Take default arch/*/kernel/audit.c to lib/, have those with special
needs (== biarch) define AUDIT_ARCH in their Kconfig.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-09-12 03:04:40 -04:00
Al Viro 55669bfa14 [PATCH] audit: AUDIT_PERM support
add support for AUDIT_PERM predicate

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-09-11 13:32:30 -04:00
Al Viro dc104fb323 [PATCH] audit: more syscall classes added
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-09-11 13:32:27 -04:00
Suleiman Souhlal ec0063b40a [PATCH] x86_64: Don't write out segments from vsyscall32 DSO if it is not mapped
It's possible to get an invalid page fault in kernel mode when we try to
write out segments from vsyscall32 when dumping core for a 32bit process if
the vsyscall32 DSO is not mapped in its address space (which can happen if,
for example, ulimit -v 100 is run).

Signed-off-by: Suleiman Souhlal <suleiman@google.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-30 16:05:16 -07:00
Keith Owens 01ebb77b31 [PATCH] x86_64: Save original IST values for checking stack addresses
The values in init_tss.ist[] can change when an IST event occurs.  Save
the original IST values for checking stack addresses when debugging or
doing stack traces.

Signed-off-by: Keith Owens <kaos@ocs.com.au>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-30 16:05:16 -07:00
Andi Kleen 40dd2d20f2 [PATCH] x86: Disable MMCONFIG on Intel SDV using DMI blacklist
As a replacement for the earlier removal of the e820 MCFG check
we blacklist the Intel SDV with the original BIOS bug that
motivated that check. On those machines don't use MMCONFIG.

This also adds a new pci=mmconf parameter to override the blacklist.

Cc: Greg KH <gregkh@suse.de>
Cc: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-30 16:05:16 -07:00
Andi Kleen ceee882230 [PATCH] x86_64: Recover 1MB of kernel memory
Noticed by Jan Beulich.

When the kernel was moved from 1MB to 2MB in 2.6.17 the kernel reservation
code wasn't adjusted and it still reserved starting with 1MB. This means 1MB always
were lost.

This patch fixes this by reserving only starting with _text.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-30 16:05:15 -07:00
Jan Beulich ea424055b7 [PATCH] x86: Make backtracer fallback logic more bullet-proof
The unwinder fallback logic still had potential for falling through to
the legacy stack trace code without printing an indication (at once
serving as a separator) of this.

Further, the stack pointer retrieval for the fallback should be as
restrictive as possible (in order to avoid having the legacy stack
tracer try to access invalid memory). The patch tightens that, but
this could certainly be further improved.

Also making the call_trace command line option now conditional upon
CONFIG_STACK_UNWIND (as it's meaningless otherwise).

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-30 16:05:15 -07:00
Andi Kleen c05991ed12 [PATCH] x86_64: Add kernel thread stack frame termination for properly stopping stack unwinds.
One open question: Should these added pushes perhaps be made
conditional upon CONFIG_STACK_UNWIND or CONFIG_UNWIND_INFO?
[AK: Not needed -- these are all very slow paths]

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-30 16:05:15 -07:00
Andi Kleen 11012d419c [PATCH] x86: Revert e820 MCFG heuristics
The check for the MCFG table being reserved in the e820 map was originally
added to detect a broken BIOS in a preproduction Intel SDV. However it also
breaks the Apple x86 Macs, which can't supply this properly, but need
a working MCFG. With this patch they wouldn't use the MCFG and not work.

After some discussion I think it's best to remove the heuristic again.
It also failed on some other boxes (although it didn't cause much
problems there because old style port access for PCI config space
still works as fallback), but the preproduction SDVs can just use
pci=nommcfg. Supporting production machines properly is more
important.

Edgar Hucek did all the debugging work.

Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Edgar Hucek <hostmaster@ed-soft.at>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-30 16:05:15 -07:00
Andi Kleen ddcf36511d [PATCH] x86_64: Update defconfig
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-30 16:05:15 -07:00
Horms 012c437d03 [PATCH] Change panic_on_oops message to "Fatal exception"
Previously the message was "Fatal exception: panic_on_oops", as introduced
in a recent patch whith removed a somewhat dangerous call to ssleep() in
the panic_on_oops path.  However, Paul Mackerras suggested that this was
somewhat confusing, leadind people to believe that it was panic_on_oops
that was the root cause of the fatal exception.  On his suggestion, this
patch changes the message to simply "Fatal exception".  A suitable oops
message should already have been displayed.

Signed-off-by: Simon Horman <horms@verge.net.au>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-08-14 12:54:29 -07:00
Alexey Dobriyan 825e037fb8 [PATCH] Fix more per-cpu typos
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-06 08:57:47 -07:00
Muli Ben-Yehuda a166222cde [PATCH] x86_64: Fix CONFIG_IOMMU_DEBUG
If CONFIG_IOMMU_DEBUG is set force_iommu defaults to 1. In the case
where no HW IOMMU is present in the machine and we end up using nommu,
leaving force_iommu set to 1 causes dma_alloc_coherent to do the wrong
thing. Therefore, if we end up using nommu, make sure force_iommu is
0.

Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-02 20:19:54 -07:00
Andi Kleen 2699500b31 [PATCH] x86_64: Fix backtracing for interrupt stacks
Re-add backlink for old style unwinder to stack switching.  Add proper
stack frame and CFI annotations to call_softirq

This prevents a oops when backtracing with fallback through the
interrupt stack top.

Suggested by Jan Beulich and Herbert Xu wanted it in 2.6.18.

Cc: jbeulich@novell.com
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-02 20:19:54 -07:00
Roland McGrath 0b0bf7a3cc [PATCH] vDSO hash-style fix
The latest toolchains can produce a new ELF section in DSOs and
dynamically-linked executables.  The new section ".gnu.hash" replaces
".hash", and allows for more efficient runtime symbol lookups by the
dynamic linker.  The new ld option --hash-style={sysv|gnu|both} controls
whether to produce the old ".hash", the new ".gnu.hash", or both.  In some
new systems such as Fedora Core 6, gcc by default passes --hash-style=gnu
to the linker, so that a standard invocation of "gcc -shared" results in
producing a DSO with only ".gnu.hash".  The new ".gnu.hash" sections need
to be dealt with the same way as ".hash" sections in all respects; only the
dynamic linker cares about their contents.  To work with older dynamic
linkers (i.e.  preexisting releases of glibc), a binary must have the old
".hash" section.  The --hash-style=both option produces binaries that a new
dynamic linker can use more efficiently, but an old dynamic linker can
still handle.

The new section runs afoul of the custom linker scripts used to build vDSO
images for the kernel.  On ia64, the failure mode for this is a boot-time
panic because the vDSO's PT_IA_64_UNWIND segment winds up ill-formed.

This patch addresses the problem in two ways.

First, it mentions ".gnu.hash" in all the linker scripts alongside ".hash".
 This produces correct vDSO images with --hash-style=sysv (or old tools),
with --hash-style=gnu, or with --hash-style=both.

Second, it passes the --hash-style=sysv option when building the vDSO
images, so that ".gnu.hash" is not actually produced.  This is the most
conservative choice for compatibility with any old userland.  There is some
concern that some ancient glibc builds (though not any known old production
system) might choke on --hash-style=both binaries.  The optimizations
provided by the new style of hash section do not really matter for a DSO
with a tiny number of symbols, as the vDSO has.  If someone wants to use
=gnu or =both for their vDSO builds and worry less about that
compatibility, just change the option and the linker script changes will
make any choice work fine.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-31 13:28:43 -07:00
Chandra Seetharaman be6b5a3505 [PATCH] cpu hotplug: use hotplug version of registration in late inits
Use hotplug version of register_cpu_notifier in late init functions.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-31 13:28:39 -07:00
Horms cea6a4ba8a [PATCH] panic_on_oops: remove ssleep()
This patch is part of an effort to unify the panic_on_oops behaviour across
all architectures that implement it.

It was pointed out to me by Andi Kleen that if an oops has occured in
interrupt context, then calling sleep() in the oops path will only cause a
panic, and that it would be really better for it not to be in the path at
all.

This patch removes the ssleep() call and reworks the console message
accordinly.  I have a slght concern that the resulting console message is
too long, feedback welcome.

For powerpc it also unifies the 32bit and 64bit behaviour.

Fror x86_64, this patch only updates the console message, as ssleep() is
already not present.

Signed-off-by: Horms <horms@verge.net.au>
Acked-by: Paul Mackerras <paulus@samba.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Chris Zankel <chris@zankel.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-31 13:28:39 -07:00
Eric W. Biederman 2a8a3d5b65 [PATCH] machine_kexec.c: Fix the description of segment handling
One of my original comments in machine_kexec was unclear
and this should fix it.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Andi Kleen <ak@muc.de>
Acked-by: Horms <horms@verge.net.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-31 13:28:38 -07:00
Andi Kleen 65f87d8a8a [PATCH] x86_64: Fix swiotlb=force
It was broken before. But having it is important as possible hardware
bug workaround.

And previously there was no way to force swiotlb if there is another IOMMU.
Side effect is that iommu=force won't force swiotlb anymore even if there
isn't another IOMMU.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-29 20:59:55 -07:00
Andi Kleen 355540f333 [PATCH] x86_64: Revert k8-bus.c northbridge access change
As Travis Betak points out it accesses the wrong northbridge subfunction
now. Switch back to the old code.

Cc: "Travis Betak" <betak@mpdtxmail.amd.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-29 20:59:55 -07:00
Jon Mason d2105b10fe [PATCH] x86_64: Calgary IOMMU - Multi-Node NULL pointer dereference fix
Calgary hits a NULL pointer dereference when booting in a multi-chassis
NUMA system.  See Redhat bugzilla number 198498, found by Konrad
Rzeszutek (konradr@redhat.com).

There are many issues that had to be resolved to fix this problem.
Firstly when I originally wrote the code to handle NUMA systems, I
had a large misunderstanding that was not corrected until now.  That was
that I thought the "number of nodes online" referred to number of
physical systems connected.  So that if NUMA was disabled, there
would only be 1 node and it would only show that node's PCI bus.
In reality if NUMA is disabled, the system displays all of the
connected chassis as one node but is only ignorant of the delays
in accessing main memory.  Therefore, references to num_online_nodes()
and MAX_NUMNODES are incorrect and need to be set to the maximum
number of nodes that can be accessed (which are 8).  I created a
variable, MAX_NUM_CHASSIS, and set it to 8 to fix this.

Secondly, when walking the PCI in detect_calgary, the code only
checked the first "slot" when looking to see if a device is present.
This will work for most cases, but unfortunately it isn't always the
case.  In the NUMA MXE drawers, there are USB devices present on the
3rd slot (with slot 1 being empty).  So, to work around this, all
slots (up to 8) are scanned to see if there are any devices present.

Lastly, the bus is being enumerated on large systems in a different
way the we originally thought.  This throws the ugly logic we had
out the window.  To more elegantly handle this, I reorganized the
kva array to be sparse (which removed the need to have any bus number
to kva slot logic in tce.c) and created a secondary space array to
contain the bus number to phb mapping.

With these changes Calgary boots on an x460 with 4 nodes with and
without NUMA enabled.

Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-29 20:59:55 -07:00
Muli Ben-Yehuda 089bbbcb36 [PATCH] x86_64: Calgary IOMMU - fix off by one error
Fixed off-by-one error in detect_calgary and calgary_init which will
cause arrays to overflow.  Also, removed impossible to hit BUG_ON.

Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-29 20:59:55 -07:00
Andi Kleen 0e5f61b00c [PATCH] x86_64: On Intel systems when CPU has C3 don't use TSC
On Intel systems generally the TSC stops in C3 or deeper,
so don't use it there. Follows similar logic on i386.

This should fix problems on Meroms.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-29 20:59:55 -07:00
Andi Kleen 260f659b23 [PATCH] x86_64: Update defconfig
Update defconfig

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-29 20:59:55 -07:00
Andi Kleen b13761ecd1 [PATCH] x86_64: Dump leftover backtrace entries when dwarf2 unwinder got stuck
The dwarf2 unwinder currently often gets stuck because a lot
of assembly code doesn't have proper dwarf2 annotiation yet.

This currently often happens with __down. Should fix this by
adding proper dwarf2 annotation to all inline assembly. However
until that's done we need a quick fix for 2.6.18 to avoid
incomplete backtraces.

So when this happens dump the rest of the stack with the old unwinder
instead of silently not dumping it. There was already a optional
"both" mode that dumped both, but that was too ugly.

I also clarified the headers for the different backtraces a bit.

Also add a clear error message for missing dwarf2
annotation that people can work on.

And I removed a dead variable left over from Ingo's changes.

Cc: mingo@elte.hu
Cc: jbeulich@novell.com
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-28 19:28:00 -07:00
Andi Kleen 0e92da4acb [PATCH] x86_64: Don't clobber r8-r11 in int 0x80 handler
When int 0x80 is called from long mode r8-r11 would leak out of the
kernel (or rather they would be filled with some values from
the kernel stack). I don't think it's a security issue because
the values come from the fixed stack frame which should be near
always user registers from a previous interrupt.

Still better fix it.

Longer term the register save macros need to be cleaned up
to avoid such mistakes in the future.

Original analysis from Richard Brunner, fix by me.

Cc: Richard.Brunner@amd.com

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-28 19:28:00 -07:00
Andi Kleen d5a2601734 [PATCH] i386/x86-64: Add user_mode checks to profile_pc for oprofile
Fixes a obscure user space triggerable crash during oprofiling.

Oprofile calls profile_pc from NMIs even when user_mode(regs) is not true and
the program counter is inside the kernel lock section. This opens
a race - when a user program jumps to a kernel lock address and
a NMI happens before the illegal page fault exception is raised
and the program has a unmapped esp or ebp then the kernel could
oops. NMIs have a higher priority than exceptions so that could
happen.

Add user_mode checks to i386/x86-64 profile_pc to prevent that.

Cc: John Levon <levon@movementarian.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-28 19:28:00 -07:00
Andi Kleen 2c87e2cd0b [PATCH] x86_64: Fix access check in ptrace compat
We can't safely directly access an compat_alloc_user_space() pointer
with the siginfo copy functions. Bounce it through the stack.

Noticed by Al Viro using sparse

[ This was only added post 2.6.17, not in any released kernel ]

Cc: Al Viro <viro@ftp.linux.org.uk>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-10 15:12:33 -07:00
Muli Ben-Yehuda aa0a9f373e [PATCH] x86_64: Fix Calgary copyright statements per IBM guidelines
Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-10 15:12:33 -07:00
Jacob Shin 0d2caebd56 [PATCH] x86_64: Fix hotplug problem in mce amd
Signed-off-by: Jacob Shin <jacob.shin@amd.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-10 15:12:32 -07:00
Markus Schoder 3391c22e5b [PATCH] x86_64: Bring x86-64 ia32 emul in sync with i386 on READ_IMPLIES_EXEC enabling
Currently ia32 binaries behave differently with respect to enabling
READ_IMPLIES_EXEC.  On i386 a binary with the exec_stack flag set is
executed with READ_IMPLIES_EXEC enabled as well.  The same binary
executes without READ_IMPLIES_EXEC on x86-64.

This causes binaries that work on i386 to fail on x86-64 which goes
somewhat against the whole 32 bit emulation idea.

It has been argued that READ_IMPLIES_EXEC should not be enabled at all
for binaries that have the exec_stack flag.  Which is probably a valid
point.  However until this is clarified I think x86-64 should behave the
same for ia32 binaries as i386.

The following patch brings x86-64 in sync with i386 for ia32 binaries.

Signed-off-by: Markus Schoder <lists@gammarayburst.de>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-10 15:12:32 -07:00
Andi Kleen d5d8ad78b0 [PATCH] x86_64: Update defconfig
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-10 15:12:32 -07:00
Jon Smirl 894673ee61 [PATCH] tty: Remove include of screen_info.h from tty.h
screen_info.h doesn't have anything to do with the tty layer and shouldn't be
included by tty.h.  This patches removes the include and modifies all users to
directly include screen_info.h.  struct screen_info is mainly used to
communicate with the console drivers in drivers/video/console.  Note that this
patch touches every arch and I have no way of testing it.  If there is a
mistake the worst thing that will happen is a compile error.

[akpm@osdl.org: fix arm build]
[akpm@osdl.org: fix alpha build]
Signed-off-by: Jon Smirl <jonsmir@gmail.com>
Signed-off-by: Antonino Daplas <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-10 13:24:16 -07:00
Arjan van de Ven 1454aed92b [PATCH] put a comment at register_die_notifier that the export is used
{un}register_die_notifier() is used by kdb... document this so that future
"remove dead export" rounds can skip this export.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-10 13:24:15 -07:00
Ingo Molnar f86bf9b7bc [PATCH] lockdep: clean up completion initializer in smpboot.c
Clean up lockdep on-stack-completion initializer.  (This also removes the
dependency on waitqueue_lock_key.)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-10 13:24:14 -07:00
Andrew Morton 1a91023a9f [PATCH] x86_64: e820.c needs pgtable.h
arch/x86_64/kernel/e820.c:42: error: 'MAXMEM' undeclared here (not in a function)

Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-10 13:24:13 -07:00
Linus Torvalds 51bece910d Merge git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild
* git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild:
  kbuild: introduce utsrelease.h
  kbuild: explicit turn off gcc stack-protector
2006-07-03 21:26:12 -07:00
Ingo Molnar 60be6b9a41 [PATCH] lockdep: annotate on-stack completions
lockdep needs to have the waitqueue lock initialized for on-stack waitqueues
implicitly initialized by DECLARE_COMPLETION().  Annotate on-stack completions
accordingly.

Has no effect on non-lockdep kernels.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03 15:27:09 -07:00
Ingo Molnar 366c7f554e [PATCH] lockdep: annotate enable_in_hardirq()
Make use of local_irq_enable_in_hardirq() API to annotate places that enable
hardirqs in hardirq context.

Has no effect on non-lockdep kernels.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03 15:27:09 -07:00
Ingo Molnar 1e9505279a [PATCH] lockdep: enable on x86_64
Enable LOCKDEP_SUPPORT on x86_64.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03 15:27:05 -07:00
Ingo Molnar 2148270cd2 [PATCH] lockdep: x86_64 early init
x86_64 uses spinlocks very early - earlier than start_kernel().  So call
lockdep_init() from the arch setup code.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03 15:27:04 -07:00
Ingo Molnar 6375e2b74c [PATCH] lockdep: irqtrace cleanup of include/asm-x86_64/irqflags.h
Clean up the x86-64 irqflags.h file:

 - macro => inline function transformation
 - simplifications
 - style fixes

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03 15:27:03 -07:00
Ingo Molnar 2601e64d26 [PATCH] lockdep: irqtrace subsystem, x86_64 support
Add irqflags-tracing support to x86_64.

[akpm@osdl.org: build fix]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03 15:27:03 -07:00
Ingo Molnar 21b32bbff9 [PATCH] lockdep: stacktrace subsystem, x86_64 support
Framework to generate and save stacktraces quickly, without printing anything
to the console.  x86_64 support.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03 15:27:02 -07:00
Ingo Molnar c9ca1ba5bd [PATCH] lockdep: x86_64 document stack frame internals
Document stack frame nesting internals some more.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03 15:27:02 -07:00
Ingo Molnar 3ac94932a2 [PATCH] lockdep: beautify x86_64 stacktraces
Beautify x86_64 stacktraces to be more readable.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03 15:27:02 -07:00
Sam Ravnborg 63104eec23 kbuild: introduce utsrelease.h
include/linux/version.h contained both actual KERNEL version
and UTS_RELEASE that contains a subset from git SHA1 for when
kernel was compiled as part of a git repository.
This had the unfortunate side-effect that all files including version.h
would be recompiled when some git changes was made due to changes SHA1.
Split it out so we keep independent parts in separate files.

Also update checkversion.pl script to no longer check for UTS_RELEASE.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2006-07-03 23:30:54 +02:00
Thomas Gleixner b1e05aa230 [PATCH] irq-flags: x86_64: Use the new IRQF_ constants
Use the new IRQF_ constants and remove the SA_INTERRUPT define

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-02 13:58:49 -07:00
Linus Torvalds fc25465f09 Merge branch 'audit.b22' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current
* 'audit.b22' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current:
  [PATCH] audit syscall classes
  [PATCH] audit: support for object context filters
  [PATCH] audit: rename AUDIT_SE_* constants
  [PATCH] add rule filterkey
2006-07-01 09:59:08 -07:00
Heiko Carstens a581c2a469 [PATCH] add __[start|end]_rodata sections to asm-generic/sections.h
Add __start_rodata and __end_rodata to sections.h to avoid extern
declarations.  Needed by s390 code (see following patch).

[akpm@osdl.org: update architectures]
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Andi Kleen <ak@muc.de>
Acked-by: Kyle McMartin <kyle@mcmartin.ca>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-01 09:56:03 -07:00
Al Viro b915543b46 [PATCH] audit syscall classes
Allow to tie upper bits of syscall bitmap in audit rules to kernel-defined
sets of syscalls.  Infrastructure, a couple of classes (with 32bit counterparts
for biarch targets) and actual tie-in on i386, amd64 and ia64.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-07-01 07:44:10 -04:00
Jörn Engel 6ab3d5624e Remove obsolete #include <linux/config.h>
Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-30 19:25:36 +02:00
Adrian Bunk 80f7228b59 typo fixes: occuring -> occurring
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-30 18:27:16 +02:00
Adrian Bunk 5bba17127e [NET]: make skb_release_data() static
skb_release_data() no longer has any users in other files.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-29 16:58:30 -07:00
Matt LaPlante 1f1332f727 [PATCH] KConfig: Spellchecking 'similarity' and 'independent'
Several KConfig files had 'similarity' and 'independent' spelled incorrectly...

Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-29 11:59:14 -07:00
Ingo Molnar c0ad90a32f [PATCH] genirq: add ->retrigger() irq op to consolidate hw_irq_resend()
Add ->retrigger() irq op to consolidate hw_irq_resend() implementations.
(Most architectures had it defined to NOP anyway.)

NOTE: ia64 needs testing. i386 and x86_64 tested.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-29 10:26:23 -07:00
Ingo Molnar a53da52fd7 [PATCH] genirq: cleanup: merge irq_affinity[] into irq_desc[]
Consolidation: remove the irq_affinity[NR_IRQS] array and move it into the
irq_desc[NR_IRQS].affinity field.

[akpm@osdl.org: sparc64 build fix]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-29 10:26:22 -07:00
Ingo Molnar d1bef4ed5f [PATCH] genirq: rename desc->handler to desc->chip
This patch-queue improves the generic IRQ layer to be truly generic, by adding
various abstractions and features to it, without impacting existing
functionality.

While the queue can be best described as "fix and improve everything in the
generic IRQ layer that we could think of", and thus it consists of many
smaller features and lots of cleanups, the one feature that stands out most is
the new 'irq chip' abstraction.

The irq-chip abstraction is about describing and coding and IRQ controller
driver by mapping its raw hardware capabilities [and quirks, if needed] in a
straightforward way, without having to think about "IRQ flow"
(level/edge/etc.) type of details.

This stands in contrast with the current 'irq-type' model of genirq
architectures, which 'mixes' raw hardware capabilities with 'flow' details.
The patchset supports both types of irq controller designs at once, and
converts i386 and x86_64 to the new irq-chip design.

As a bonus side-effect of the irq-chip approach, chained interrupt controllers
(master/slave PIC constructs, etc.) are now supported by design as well.

The end result of this patchset intends to be simpler architecture-level code
and more consolidation between architectures.

We reused many bits of code and many concepts from Russell King's ARM IRQ
layer, the merging of which was one of the motivations for this patchset.

This patch:

rename desc->handler to desc->chip.

Originally i did not want to do this, because it's a big patch.  But having
both "desc->handler", "desc->handle_irq" and "action->handler" caused a
large degree of confusion and made the code appear alot less clean than it
truly is.

I have also attempted a dual approach as well by introducing a
desc->chip alias - but that just wasnt robust enough and broke
frequently.

So lets get over with this quickly.  The conversion was done automatically
via scripts and converts all the code in the kernel.

This renaming patch is the first one amongst the patches, so that the
remaining patches can stay flexible and can be merged and split up
without having some big monolithic patch act as a merge barrier.

[akpm@osdl.org: build fix]
[akpm@osdl.org: another build fix]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-29 10:26:21 -07:00
Yasunori Goto cc57637b0b [PATCH] solve config broken: undefined reference to `online_page'
Memory hotplug code of i386 adds memory to only highmem.  So, if
CONFIG_HIGHMEM is not set, CONFIG_MEMORY_HOTPLUG shouldn't be set.
Otherwise, it causes compile error.

In addition, many architecture can't use memory hotplug feature yet.  So, I
introduce CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG.

Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-29 10:26:20 -07:00
Andrew Morton d9a5685436 [PATCH] x86_64: oprofile build fix
WARNING: "unset_nmi_callback" [arch/x86_64/oprofile/oprofile.ko] undefined!
WARNING: "set_nmi_callback" [arch/x86_64/oprofile/oprofile.ko] undefined!

Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-28 14:59:07 -07:00
Andrew Morton a052b68b1e [PATCH] x86: do_IRQ(): check irq number
We recently changed x86 to handle more than 256 IRQs.  Add a check in do_IRQ()
just to make sure that nothing went wrong with that implementation.

[chrisw@sous-sol.org: do x86_64 too]
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Andi Kleen <ak@muc.de>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
Cc: <Christian.Limpach@cl.cam.ac.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-28 14:59:02 -07:00
Siddha, Suresh B 5c45bf279d [PATCH] sched: mc/smt power savings sched policy
sysfs entries 'sched_mc_power_savings' and 'sched_smt_power_savings' in
/sys/devices/system/cpu/ control the MC/SMT power savings policy for the
scheduler.

Based on the values (1-enable, 0-disable) for these controls, sched groups
cpu power will be determined for different domains.  When power savings
policy is enabled and under light load conditions, scheduler will minimize
the physical packages/cpu cores carrying the load and thus conserving
power(with a perf impact based on the workload characteristics...  see OLS
2005 CMP kernel scheduler paper for more details..)

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Con Kolivas <kernel@kolivas.org>
Cc: "Chen, Kenneth W" <kenneth.w.chen@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:45 -07:00
Chandra Seetharaman 74b85f3790 [PATCH] cpu hotplug: make cpu_notifier related notifier blocks __cpuinit only
Make notifier_blocks associated with cpu_notifier as __cpuinitdata.

__cpuinitdata makes sure that the data is init time only unless
CONFIG_HOTPLUG_CPU is defined.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:41 -07:00
Chandra Seetharaman 9c7b216d23 [PATCH] cpu hotplug: revert init patch submitted for 2.6.17
In 2.6.17, there was a problem with cpu_notifiers and XFS.  I provided a
band-aid solution to solve that problem.  In the process, i undid all the
changes you both were making to ensure that these notifiers were available
only at init time (unless CONFIG_HOTPLUG_CPU is defined).

We deferred the real fix to 2.6.18.  Here is a set of patches that fixes the
XFS problem cleanly and makes the cpu notifiers available only at init time
(unless CONFIG_HOTPLUG_CPU is defined).

If CONFIG_HOTPLUG_CPU is defined then cpu notifiers are available at run
time.

This patch reverts the notifier_call changes made in 2.6.17

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:40 -07:00
Randy Dunlap c9cf55285e [PATCH] add poison.h and patch primary users
Localize poison values into one header file for better documentation and
easier/quicker debugging and so that the same values won't be used for
multiple purposes.

Use these constants in core arch., mm, driver, and fs code.

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Acked-by: Matt Mackall <mpm@selenic.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:38 -07:00
Rusty Russell 19eadf98c8 [PATCH] x86: increase interrupt vector range
Remove the limit of 256 interrupt vectors by changing the value stored in
orig_{e,r}ax to be the complemented interrupt vector.  The orig_{e,r}ax
needs to be < 0 to allow the signal code to distinguish between return from
interrupt and return from syscall.  With this change applied, NR_IRQS can
be > 256.

Xen extends the IRQ numbering space to include room for dynamically
allocated virtual interrupts (in the range 256-511), which requires a more
permissive interface to do_IRQ.

Signed-off-by: Ian Pratt <ian.pratt@xensource.com>
Signed-off-by: Christian Limpach <Christian.Limpach@cl.cam.ac.uk>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:37 -07:00
Yasunori Goto bc02af93dd [PATCH] pgdat allocation for new node add (specify node id)
Change the name of old add_memory() to arch_add_memory.  And use node id to
get pgdat for the node at NODE_DATA().

Note: Powerpc's old add_memory() is defined as __devinit. However,
      add_memory() is usually called only after bootup.
      I suppose it may be redundant. But, I'm not well known about powerpc.
      So, I keep it. (But, __meminit is better at least.)

Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: "Brown, Len" <len.brown@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:35 -07:00