Commit Graph

769 Commits (6760856791c6e527da678021ee6a67896549d4da)

Author SHA1 Message Date
Andi Kleen aecc63615e [PATCH] Fix coding style and output of the mptable parser
Give the printks a consistent prefix.
Add some missing white space.

Cc: len.brown@intel.com

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:37 +02:00
Andi Kleen e4251e130d [PATCH] Remove some cruft in apic id checking during processor setup
- Remove a define that was used only once
- Remove the too large APIC ID check because we always support
the full 8bit range of APICs.
- Restructure code a bit to be simpler.

Cc: len.brown@intel.com

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:37 +02:00
Andi Kleen f2c2cca3ac [PATCH] Remove APIC version/cpu capability mpparse checking/printing
ACPI went to great trouble to get the APIC version and CPU capabilities
of different CPUs before passing them to the mpparser. But all
that data was used was to print it out.  Actually it even faked some data
based on the boot cpu, not on the actual CPU being booted.

Remove all this code because it's not needed.

Cc: len.brown@intel.com

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:37 +02:00
Andi Kleen 151f8cc116 [PATCH] Remove safe_smp_processor_id()
And replace all users with ordinary smp_processor_id.  The function
was originally added to get some basic oops information out even
if the GS register was corrupted. However that didn't
work for some anymore because printk is needed to print the oops
and it uses smp_processor_id() already. Also GS register corruptions
are not particularly common anymore.

This also helps the Xen port which would otherwise need to
do this in a special way because it can't access the local APIC.

Cc: Chris Wright <chrisw@sous-sol.org>

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:37 +02:00
Rafael J. Wysocki 34464a5b89 [PATCH] Detect clock skew during suspend
Detect the situations in which the time after a resume from disk would
be earlier than the time before the suspend and prevent them from
happening on x86_64.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:37 +02:00
Andi Kleen dbf9272e86 [PATCH] Don't force reserve the 640k-1MB range
From i386 x86-64 inherited code to force reserve the 640k-1MB area.
That was needed on some old systems.

But we generally trust the e820 map to be correct on 64bit systems
and mark all areas that are not memory correctly.

This patch will allow to use the real memory in there.

Or rather the only way to find out if it's still needed is to
try. So far I'm optimistic.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:36 +02:00
Magnus Damm ed77504b20 [PATCH] mark init_amd() as __cpuinit
The init_amd() function is only called from identify_cpu() which is already
marked as __cpuinit. So let's mark it as __cpuinit.

Signed-off-by: Magnus Damm <magnus@valinux.co.jp>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:36 +02:00
Andrew Morton abf0f10948 [PATCH] wire up oops_enter()/oops_exit()
Implement pause_on_oops() on x86_64.

AK: I redid the patch to do the oops_enter/exit in the existing
oops_begin()/end(). This makes it much shorter.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:36 +02:00
Arjan van de Ven e07e23e1fd [PATCH] non lazy "sleazy" fpu implementation
Right now the kernel on x86-64 has a 100% lazy fpu behavior: after *every*
context switch a trap is taken for the first FPU use to restore the FPU
context lazily.  This is of course great for applications that have very
sporadic or no FPU use (since then you avoid doing the expensive
save/restore all the time).  However for very frequent FPU users...  you
take an extra trap every context switch.

The patch below adds a simple heuristic to this code: After 5 consecutive
context switches of FPU use, the lazy behavior is disabled and the context
gets restored every context switch.  If the app indeed uses the FPU, the
trap is avoided.  (the chance of the 6th time slice using FPU after the
previous 5 having done so are quite high obviously).

After 256 switches, this is reset and lazy behavior is returned (until
there are 5 consecutive ones again).  The reason for this is to give apps
that do longer bursts of FPU use still the lazy behavior back after some
time.

[akpm@osdl.org: place new task_struct field next to jit_keyring to save space]
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-09-26 10:52:36 +02:00
Eric W. Biederman ba4d40bb5c [PATCH] Auto size the per cpu area.
Now for a completely different but trivial approach.
I just boot tested it with 255 CPUS and everything worked.

Currently everything (except module data) we place in
the per cpu area we know about at compile time.  So
instead of allocating a fixed size for the per_cpu area
allocate the number of bytes we need plus a fixed constant
for to be used for modules.

It isn't perfect but it is much less of a pain to
work with than what we are doing now.

AK: fixed warning

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:35 +02:00
Andi Kleen 2717941c6a [PATCH] Make boot_param_data pure BSS
Since it's all zero.

Actually I think gcc 4+ will do that automatically, but earlier compilers won't
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:35 +02:00
Dimitri Sivanich cbf9b4bb76 [PATCH] X86_64 monotonic_clock goes backwards
I've noticed some erratic behavior while testing the X86_64 version
of monotonic_clock().

While spinning in a loop reading monotonic clock values (pinned to a
single cpu) I noticed that the difference between subsequent values
occasionally went negative (time going backwards).

I found that in the following code:
                this_offset = get_cycles_sync();
                /* FIXME: 1000 or 1000000? */
-->             offset = (this_offset - last_offset)*1000 / cpu_khz;
        }
        return base + offset;

the offset sometimes turns out to be 0, even though
this_offset > last_offset.

+Added fix From: Toyo Abe <toyoa@mvista.com>

The x86_64-mm-monotonic-clock.patch in 2.6.18-rc4-mm2 made a change to
the updating of monotonic_base. It now uses cycles_2_ns().

I suggest that a set_cyc2ns_scale() should be done prior to the setup_irq().
Because cycles_2_ns() can be called from the timer ISR right after the irq0
is enabled.

Signed-off-by: Toyo Abe <toyoa@mvista.com>
Signed-off-by: Dimitri Sivanich <sivanich@sgi.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:34 +02:00
Prasanna S.P d28c4393a7 [PATCH] x86: error_code is not safe for kprobes
This patch moves the entry.S:error_entry to .kprobes.text section,
since code marked unsafe for kprobes jumps directly to entry.S::error_entry,
that must be marked unsafe as well.
This patch also moves all the ".previous.text" asm directives to ".previous"
for kprobes section.

AK: Following a similar i386 patch from Chuck Ebbert
AK: Also merged Jeremy's fix in.

+From: Jeremy Fitzhardinge <jeremy@goop.org>

KPROBE_ENTRY does a .section .kprobes.text, and expects its users to
do a .previous at the end of the function.

Unfortunately, if any code within the function switches sections, for
example .fixup, then the .previous ends up putting all subsequent code
into .fixup.  Worse, any subsequent .fixup code gets intermingled with
the code its supposed to be fixing (which is also in .fixup).  It's
surprising this didn't cause more havok.

The fix is to use .pushsection/.popsection, so this stuff nests
properly.  A further cleanup would be to get rid of all
.section/.previous pairs, since they're inherently fragile.

+From: Chuck Ebbert <76306.1226@compuserve.com>

Because code marked unsafe for kprobes jumps directly to
entry.S::error_code, that must be marked unsafe as well.
The easiest way to do that is to move the page fault entry
point to just before error_code and let it inherit the same
section.

Also moved all the ".previous" asm directives for kprobes
sections to column 1 and removed ".text" from them.

Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:34 +02:00
Andi Kleen be7a91709b [PATCH] Check for end of stack trace before falling back
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:34 +02:00
Andi Kleen c0b766f13d [PATCH] Merge stacktrace and show_trace
This unifies the standard backtracer and the new stacktrace
in memory backtracer. The standard one is converted to use callbacks
and then reimplement stacktrace using new callbacks.

The main advantage is that stacktrace can now use the new dwarf2 unwinder
and avoid false positives in many cases.

I kept it simple to make sure the standard backtracer stays reliable.

Cc: mingo@elte.hu

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:34 +02:00
Andi Kleen b7f5e3c774 [PATCH] Don't access the APIC in safe_smp_processor_id when it is not mapped yet
Lockdep can call the dwarf2 unwinder early, and the dwarf2 code
uses safe_smp_processor_id which tries to access the local APIC page.
But that doesn't work before the APIC code has set up its fixmap.

Check for this case and always return boot cpu then.

Cc: jbeulich@novell.com
Cc: mingo@elte.hu

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:34 +02:00
Andi Kleen 5a1b3999d6 [PATCH] x86: Some preparationary cleanup for stack trace
- Remove unused all_contexts parameter
No caller used it
- Move skip argument into the structure (needed for
followon patches)

Cc: mingo@elte.hu

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:34 +02:00
Muli Ben-Yehuda 4ea8a5d8b5 [PATCH] Calgary IOMMU: eradicate sole remaining 80 chars per line offender
Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:33 +02:00
Muli Ben-Yehuda 4ccf4ae314 [PATCH] remove tce_cache_blast_stress()
tce_cache_blast_stress was useful during bringup to stress the IOMMU's
cache flushing. Now that we quiesce DMAs on every cache flush, using
_stress() brings the machine down to its knees once you put it under
load. Remove this debug / bringup code that isn't useful anymore
completely.

Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:33 +02:00
Muli Ben-Yehuda 796e4390e0 [PATCH] only verify the allocation bitmap if CONFIG_IOMMU_DEBUG is on
Introduce new function verify_bit_range(). Define two versions, one
for CONFIG_IOMMU_DEBUG enabled and one for disabled. Previously we
were checking that the bitmap was consistent every time we allocated
or freed an entry in the TCE table, which is good for debugging but
incurs an unnecessary penalty on non debug builds.

Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:33 +02:00
Muli Ben-Yehuda de684652f3 [PATCH] print whether CONFIG_IOMMU_DEBUG is enabled
Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:33 +02:00
Chuck Ebbert 2ade2920dc [PATCH] i386/x86-64: rename is_at_popf(), add iret to tests and fix
is_at_popf() needs to test for the iret instruction as well as
popf.  So add that test and rename it to is_setting_trap_flag().

Also change max insn length from 16 to 15 to match reality.

LAHF / SAHF can't affect TF, so the comment in x86_64 is removed.

Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:33 +02:00
Fernando Luis Vázquez Cao 2b94ab2fd5 [PATCH] Replace local_save_flags+local_irq_disable with
The combination of "local_save_flags" and "local_irq_disable" seems to be
equivalent to "local_irq_save" (see code snips below). Consequently, replace
occurrences of local_save_flags+local_irq_disable with local_irq_save.

* local_irq_save
#define raw_local_irq_save(flags) \
                do { (flags) = __raw_local_irq_save(); } while (0)

static inline unsigned long __raw_local_irq_save(void)
{
        unsigned long flags = __raw_local_save_flags();

        raw_local_irq_disable();

        return flags;
}

* local_save_flags
#define raw_local_save_flags(flags) \
                do { (flags) = __raw_local_save_flags(); } while (0)

Signed-off-by: Fernando Vazquez <fernando@intellilink.co.jp>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:33 +02:00
Andi Kleen 131cfd7bd5 [PATCH] Add sparse annotation to vsyscall.c
Fixes

linux/arch/x86_64/kernel/vsyscall.c:276:7: warning: constant 0x0f40000000000 is so big it is long
linux/arch/x86_64/kernel/vsyscall.c:80:14: warning: incorrect type in argument 1 (different address spaces)
linux/arch/x86_64/kernel/vsyscall.c:80:14:    expected void const volatile [noderef] *addr<asn:2>
linux/arch/x86_64/kernel/vsyscall.c:80:14:    got void *<noident>
linux/arch/x86_64/kernel/vsyscall.c:200:7: warning: incorrect type in assignment (different address spaces)
linux/arch/x86_64/kernel/vsyscall.c:200:7:    expected unsigned short [usertype] *map1
linux/arch/x86_64/kernel/vsyscall.c:200:7:    got void [noderef] *<asn:2>
linux/arch/x86_64/kernel/vsyscall.c:203:7: warning: incorrect type in assignment (different address spaces)
linux/arch/x86_64/kernel/vsyscall.c:203:7:    expected unsigned short [usertype] *map2
linux/arch/x86_64/kernel/vsyscall.c:203:7:    got void [noderef] *<asn:2>
linux/arch/x86_64/kernel/vsyscall.c:215:10: warning: incorrect type in argument 1 (different address spaces)
linux/arch/x86_64/kernel/vsyscall.c:215:10:    expected void volatile [noderef] *addr<asn:2>
linux/arch/x86_64/kernel/vsyscall.c:215:10:    got unsigned short [usertype] *map2
linux/arch/x86_64/kernel/vsyscall.c:217:10: warning: incorrect type in argument 1 (different address spaces)
linux/arch/x86_64/kernel/vsyscall.c:217:10:    expected void volatile [noderef] *addr<asn:2>
linux/arch/x86_64/kernel/vsyscall.c:217:10:    got unsigned short [usertype] *map1

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:33 +02:00
Andi Kleen 3bd4d18cba [PATCH] Move e820 map into e820.c
Minor cleanup. Keep setup.c free from unrelated clutter.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:33 +02:00
Andi Kleen c31fbb1ad8 [PATCH] Clean up acpi_numa variable
Move it into srat.c No need to clutter up setup.c for it

And remove use in setup.c completely - it only guarded a printk
which can be done unconditionally.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:33 +02:00
Andi Kleen df3bb57d2c [PATCH] i386/x86-64: Move acpi_disabled variables into acpi/boot.c
Removes code duplication between i386/x86-64.

Not needed anymore in setup.c since early_param cleanup

Cc: len.brown@intel.com
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:33 +02:00
Andi Kleen 43c85c9c5d [PATCH] Remove need for early lockdep init
I think it was only needed for the printks and we can do them later.

I put in a single early_printk so that we know the kernel is alive
(early_printk doesn't need any locks)

This makes some things easier for initialization of unwind for
lockdep, which is needed by later patches.

cc: mingo@elte.hu

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:32 +02:00
Andi Kleen 2c8c0e6b8d [PATCH] Convert x86-64 to early param
Instead of hackish manual parsing

Requires earlier i386 patchkit, but also fixes i386 early_printk again.

I removed some obsolete really early parameters which didn't do anything useful.
Also made a few parameters that needed it early (mostly oops printing setup)

Also removed one panic check that wasn't visible without
early console anyways (the early console is now initialized after that
panic)

This cleans up a lot of code.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:32 +02:00
Andi Kleen 9ca33eb698 [PATCH] Use early CPU identify before early command line parsing
This makes it possible to modify CPU flags in command line
options without hacks.

And remove another copy in head64.c

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:32 +02:00
Chuck Ebbert d4d35854a1 [PATCH] remove lock prefix from is_at_popf() tests
The lock prefix will cause an exception when used with the
popf instruction, so no need to continue searching after it's
found.

Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:32 +02:00
Muli Ben-Yehuda 145106e810 [PATCH] remove superflous BUG_ON's in nommu and gart
There's no need to check for invalid DMA data direction in nommu and
gart since we do it in dma-mapping.h anyway before calling the
individual dma-ops.

Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:32 +02:00
Eric W. Biederman 29a6c25bd6 [PATCH] Fix gdt table size in trampoline.S
Allows easier extension of the GDT by using the proper C symbol
for the size in the descriptor.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:32 +02:00
Chuck Ebbert a752d7194c [PATCH] fix is_at_popf() for compat tasks
When testing for the REX instruction prefix, first check
for 32-bit mode because in compat mode the REX prefix is an
increment instruction.

Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:32 +02:00
Muli Ben-Yehuda 0577f148b5 [PATCH] Calgary IOMMU: save a bit of space in bus_info
Make translation_disabled a uchar rather than an int

Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:31 +02:00
Muli Ben-Yehuda a4fc520a0f [PATCH] Calgary IOMMU: calgary_init_one_nontraslated() can return void
Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:31 +02:00
Muli Ben-Yehuda 871b17008e [PATCH] Calgary IOMMU: fix reference counting of Calgary PCI devices
The pci_get_device() API decrements the reference count on the 'from'
parameter when it continues searching. Therefore, take a ref count on
Calgary bus when we initialize them in either translated or
non-translated mode.

Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:31 +02:00
Muli Ben-Yehuda b8f4fe66a5 [PATCH] Calgary IOMMU: fix error path memleak in calgary_free_tar
We were freeing the iommu_table and leaking the bitmap pages. Also
rename it to calgary_free_bus, which is more accurate.

Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:31 +02:00
Muli Ben-Yehuda 9f2dc46d5e [PATCH] Calgary IOMMU: break out of pci_find_device_reverse if dev not found
Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:31 +02:00
Muli Ben-Yehuda f38db651d5 [PATCH] Calgary IOMMU: consolidate per bus data structures
Move the tce_table_kva array, disabled bitmap and bus_to_phb array
into a new per bus 'struct calgary_bus_info'. Also slightly reorganize
build_tce_table and tce_table_setparms to avoid exporting bus_info to
tce.c.

Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:31 +02:00
Jan Beulich 3b94355c47 [PATCH] remove int_delivery_dest
The genapic field and the accessor macro weren't used anywhere.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:31 +02:00
Jan Beulich caff0710eb [PATCH] initialize end of memory variables as early as possible
While an earlier patch already did a small step into that direction,
this patch moves initialization of all memory end variables to as
early as possible, so that dependent code doesn't need to check
whether these variables have already been set.

Also, remove a misleading (perhaps just outdated) comment, and make
static a variable only used in a single file.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:31 +02:00
Andi Kleen 44cc45267b [PATCH] Remove obsolete CVS $Id$ from assembler files in arch/x86_64/kernel/*
CVS hasn't been used for a long time for them.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:31 +02:00
Andi Kleen fe7414a288 [PATCH] Use BUILD_BUG_ON in apic.c build sanity checking
Makes code a little shorter.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:30 +02:00
Andi Kleen efec3b9a32 [PATCH] Fix up some non linuxy style in ACPI functions in mpparse.c
No functional changes.

Cc: len.brown@intel.com
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:30 +02:00
Andi Kleen 276ec1a76a [PATCH] Remove some unneeded ACPI externs in mpparse.c
They are not used in this file so remove them. i386 didn't have them either.

Cc: len.brown@intel.com
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:30 +02:00
Andi Kleen a01fd3baff [PATCH] Remove useless wrapper in mpparse.c code
It used to contain support code for NUMAQ, but that is long gone already
on 64bit.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:30 +02:00
Andi Kleen 55f05ffaa7 [PATCH] Replace mp bus array with bitmap for bus not pci
Since we only support PCI and ISA legacy busses now there is no need to
have an full array with checking.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:30 +02:00
Andi Kleen dfa4698c50 [PATCH] Move early chipset quirks out to new file
They did not really belong into io_apic.c. Move them into a new file
and clean it up a bit.

Also remove outdated ATI quirk that was obsolete,

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:30 +02:00
Andi Kleen edd9652296 [PATCH] Remove MPS table APIC renumbering
The MPS table specification says that the operating system should
renumber the IO-APICs following the table as needed.  However in
ACPI this is not allowed or neeeded and all x86-64 systems are ACPI
compliant.

The code was already disabled on some systems because it caused
problems there. Remove it completely now.

CC: mdomsch@dell.com

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:30 +02:00
Andi Kleen eea0e11c1f [PATCH] Factor out common io apic routing entry access
The IO APIC code had lots of duplicated code to read/write 64bit
routing entries into the IO-APIC. Factor this out int common read/write
functions

In a few cases the IO APIC lock is taken more often now, but this
isn't a problem because it's all initialization/shutdown only
slow path code.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:30 +02:00
Andi Kleen c1a58b42b4 [PATCH] i386/x86-64: Remove obsolete sanity check in mptable parsing
It apparently has never triggered in many years.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:30 +02:00
Andi Kleen a8fcf1a24a [PATCH] Remove obsolete PIC mode
PIC mode is an outdated way to drive the APICs that was used on
some early MP boards. It is not supported in the ACPI model.

It is unlikely to be ever configured by any x86-64 system

Remove it thus.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:30 +02:00
Andi Kleen e509913434 [PATCH] Remove leftover MCE/EISA support
No 64bit EISA or Microchannel systems ever. Remove the left over code
in the IO-APIC driver and the mptable parser

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:29 +02:00
Andi Kleen 5cb6b99928 [PATCH] Remove pirq overwrite support
This was an old workaround for broken MP-BIOS. The user could
specify overwrites on the command line.

I've never seen it being used for anything on 64bit. So get
rid of it for now.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:29 +02:00
Andi Kleen 2e91a17b35 [PATCH] Add some comments to entry.S
And remove some old obsolete ones.
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:29 +02:00
Andi Kleen 3f14c746a6 [PATCH] Remove old "focus disabled" chipset errata workaround
The new systems already use focus disabled and the comment was
completely outdated.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:29 +02:00
Andi Kleen 6c96a29f20 [PATCH] Remove apic mismatch counter
Nobody has been setting the mismatch counter and the ifdef was never
set so remove it.
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:29 +02:00
Andi Kleen 7f11d8a5ef [PATCH] Remove all ifdefs for local/io apic
IO-APIC or local APIC can only be disabled at runtime anyways and
Kconfig has forced these options on for a long time now.

The Kconfigs are kept only now for the benefit of the shared acpi
boot.c code.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:29 +02:00
Andi Kleen 5ba5891d44 [PATCH] Add some comments what tce.c actually does
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:29 +02:00
Andi Kleen b06babac45 [PATCH] Add proper alignment to ENTRY
Previously it didn't align. Use the same one as the C compiler
in blended mode, which is good for K8 and Core2 and doesn't hurt
on P4.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:29 +02:00
Andi Kleen 31679f38d8 [PATCH] Simplify profile_pc on x86-64
Use knowledge about EFLAGS layout (bits 22:63 are always 0) to distingush
EFLAGS word and kernel address in the spin lock stack frame.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:28 +02:00
Andi Kleen c16b63e09d [PATCH] i386/x86-64: Don't randomize stack top when no randomization personality is set
Based on patch from Frank van Maarseveen <frankvm@frankvm.com>, but
extended.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:28 +02:00
Adam Henley d5d9ca6d88 [PATCH] A few trivial spelling and grammar fixes
A few trivial spelling and grammar mistakes picked up in
"arch/x86_64/aperture.c", "arch/x86_64/crash.c" and
"arch/x86_64/apic.c". I think all are correct fixes but am ever aware
of my fallibility :o) This is my first patch submission so all
feedback is appreciated, esp. WRT CCing to Linus, Andi and
trivial@kernel.org, is this correct? And which is the most appropriate
kernel version to diff against? If any.

Should apply cleanly to 2.6.18-rc1

Signed-off-by: Adam Henley <adamazing@gmail.com>
Signed-off-by: Andi Kleen <ak@suse.de>

-  adam
2006-09-26 10:52:28 +02:00
Stephane Eranian d3a4f48d48 [PATCH] x86-64 TIF flags for debug regs and io bitmap in ctxsw
Hello,

Following my discussion with Andi. Here is a patch that introduces
two new TIF flags to simplify the context switch code in __switch_to().
The idea is to minimize the number of cache lines accessed in the common
case, i.e., when neither the debug registers nor the I/O bitmap are used.

This patch covers the x86-64 modifications. A patch for i386 follows.

Changelog:
	- add TIF_DEBUG to track when debug registers are active
	- add TIF_IO_BITMAP to track when I/O bitmap is used
	- modify __switch_to() to use the new TIF flags

<signed-off-by>: eranian@hpl.hp.com

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:28 +02:00
Andi Kleen 2f766d1606 [PATCH] Clean up asm/smp.h includes
No need to include it from entry.S
Drop all the #ifdef __ASSEMBLY__

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:28 +02:00
Vojtech Pavlik c08c820508 [PATCH] Add the vgetcpu vsyscall
This patch adds a vgetcpu vsyscall, which depending on the CPU RDTSCP
capability uses either the RDTSCP or CPUID to obtain a CPU and node
numbers and pass them to the program.

AK: Lots of changes over Vojtech's original code:
Better prototype for vgetcpu()
It's better to pass the cpu / node numbers as separate arguments
to avoid mistakes when going from SMP to NUMA.
Also add a fast time stamp based cache using a user supplied
argument to speed things more up.
Use fast method from Chuck Ebbert to retrieve node/cpu from
GDT limit instead of CPUID
Made sure RDTSCP init is always executed after node is known.
Drop printk

Signed-off-by: Vojtech Pavlik <vojtech@suse.cz>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:28 +02:00
Vojtech Pavlik a670fad0ad [PATCH] Add initalization of the RDTSCP auxilliary values
This patch adds initalization of the RDTSCP auxilliary values to CPU numbers
to time.c. If RDTSCP is available, the MSRs are written with the respective
values. It can be later used to initalize per-cpu timekeeping variables.

AK: Some cleanups. Move externs into headers and fix CPU hotplug.

Signed-off-by: Vojtech Pavlik <vojtech@suse.cz>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:28 +02:00
Venkatesh Pallipadi 248dcb2fff [PATCH] x86: i386/x86-64 Add nmi watchdog support for new Intel CPUs
AK: This redoes the changes I temporarily reverted.

Intel now has support for Architectural Performance Monitoring Counters
( Refer to IA-32 Intel Architecture Software Developer's Manual
http://www.intel.com/design/pentium4/manuals/253669.htm ). This
feature is present starting from Intel Core Duo and Intel Core Solo processors.

What this means is, the performance monitoring counters and some performance
monitoring events are now defined in an architectural way (using cpuid).
And there will be no need to check for family/model etc for these architectural
events.

Below is the patch to use this performance counters in nmi watchdog driver.
Patch handles both i386 and x86-64 kernels.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:27 +02:00
Vivek Goyal 3f22c5789e [PATCH] kdump x86_64 nmi event notification fix
After a crash we should wait for NMI IPI event and not for external NMI or
NMI watchdog tick.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-09-26 10:52:27 +02:00
Andi Kleen fac58550e8 [PATCH] Fix up panic messages for different NMI panics
When a unknown NMI happened the panic would claim a NMI watchdog timeout.
Also it would check the variable set by nmi_watchdog=panic and panic then.

Fix up the panic message to be generic
Unconditionally panic on unknown NMI when panic on unknown nmi is enabled.

Noticed by Jan Beulich

Cc: jbeulich@novell.com

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:27 +02:00
Shaohua Li 4038f901cf [PATCH] i386/x86-64: Fix NMI watchdog suspend/resume
Making NMI suspend/resume work with SMP. We use CPU hotplug to offline
APs in SMP suspend/resume. Only BSP executes sysdev's .suspend/.resume
method. APs should follow CPU hotplug code path.

And:

+From: Don Zickus <dzickus@redhat.com>

Makes the start/stop paths of nmi watchdog more robust to handle the
suspend/resume cases more gracefully.

AK: I merged the two patches together

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-09-26 10:52:27 +02:00
Don Zickus c41c5cd3b2 [PATCH] x86: x86 clean up nmi panic messages
Clean up some of the output messages on the nmi error paths to make more
sense when they are displayed.  This is mainly a cosmetic fix and
shouldn't impact any normal code path.

Signed-off-by:  Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:27 +02:00
Don Zickus 8da5adda91 [PATCH] x86: Allow users to force a panic on NMI
To quote Alan Cox:

The default Linux behaviour on an NMI of either memory or unknown is to
continue operation. For many environments such as scientific computing
it is preferable that the box is taken out and the error dealt with than
an uncorrected parity/ECC error get propogated.

A small number of systems do generate NMI's for bizarre random reasons
such as power management so the default is unchanged. In other respects
the new proc/sys entry works like the existing panic controls already in
that directory.

This is separate to the edac support - EDAC allows supported chipsets to
handle ECC errors well, this change allows unsupported cases to at least
panic rather than cause problems further down the line.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:27 +02:00
Don Zickus e33e89ab1a [PATCH] x86: Add abilty to enable/disable nmi watchdog from procfs (update)
Adds a new /proc/sys/kernel/nmi_watchdog call that will enable/disable the
nmi watchdog.

By entering a non-zero value here, a user can enable the nmi watchdog to
monitor the online cpus in the system.  By entering a zero value here, a
user can disable the nmi watchdog and free up a performance counter which
could then be utilized by the oprofile subsystem, otherwise oprofile may be
short a counter when in use.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-09-26 10:52:27 +02:00
Don Zickus 407984f1af [PATCH] x86: Add abilty to enable/disable nmi watchdog with sysctl
Adds a new /proc/sys/kernel/nmi call that will enable/disable the nmi
watchdog.

Signed-off-by:  Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:27 +02:00
Don Zickus 2fbe7b25c8 [PATCH] i386/x86-64: Remove un/set_nmi_callback and reserve/release_lapic_nmi functions
Removes the un/set_nmi_callback and reserve/release_lapic_nmi functions as
they are no longer needed.  The various subsystems are modified to register
with the die_notifier instead.

Also includes compile fixes by Andrew Morton.

Signed-off-by:  Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:27 +02:00
Andi Kleen 1d001df19d [PATCH] Add TIF_RESTORE_SIGMASK
We need TIF_RESTORE_SIGMASK in order to support ppoll() and pselect()
system calls. This patch originally came from Andi, and was based
heavily on David Howells' implementation of same on i386. I fixed a typo
which was causing do_signal() to use the wrong signal mask.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:26 +02:00
Don Zickus 3adbbcce9a [PATCH] x86: Cleanup NMI interrupt path
This patch cleans up the NMI interrupt path.  Instead of being gated by if
the 'nmi callback' is set, the interrupt handler now calls everyone who is
registered on the die_chain and additionally checks the nmi watchdog,
reseting it if enabled.  This allows more subsystems to hook into the NMI if
they need to (without being block by set_nmi_callback).

Signed-off-by:  Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:26 +02:00
Don Zickus f2802e7f57 [PATCH] Add SMP support on x86_64 to reservation framework
This patch includes the changes to make the nmi watchdog on x86_64 SMP
aware.  A bunch of code was moved around to make it simpler to read.  In
addition, it is now possible to determine if a particular NMI was the result
of the watchdog or not.  This feature allows the kernel to filter out
unknown NMIs easier.

Signed-off-by:  Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:26 +02:00
Don Zickus 828f0afda1 [PATCH] x86: Add performance counter reservation framework for UP kernels
Adds basic infrastructure to allow subsystems to reserve performance
counters on the x86 chips.  Only UP kernels are supported in this patch to
make reviewing easier.  The SMP portion makes a lot more changes.

Think of this as a locking mechanism where each bit represents a different
counter.  In addition, each subsystem should also reserve an appropriate
event selection register that will correspond to the performance counter it
will be using (this is mainly neccessary for the Pentium 4 chips as they
break the 1:1 relationship to performance counters).

This will help prevent subsystems like oprofile from interfering with the
nmi watchdog.

Signed-off-by:  Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:26 +02:00
Andi Kleen b07f8915cd [PATCH] x86: Temporarily revert parts of the Core 2 nmi nmi watchdog support
This makes merging easier.  They are readded a few patches later.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:26 +02:00
Linus Torvalds 79e453d49b Revert mmiocfg heuristics and blacklist changes
This reverts commits 11012d419c and
40dd2d20f2, which allowed us to use the
MMIO accesses for PCI config cycles even without the area being marked
reserved in the e820 memory tables.

Those changes were needed for EFI-environment Intel macs, but broke some
newer Intel 965 boards, so for now it's better to revert to our old
2.6.17 behaviour and at least avoid introducing any new breakage.

Andi Kleen has a set of patches that work with both EFI and the broken
Intel 965 boards, which will be applied once they get wider testing.

Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Edgar Hucek <hostmaster@ed-soft.at>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-19 08:15:22 -07:00
Al Viro 55669bfa14 [PATCH] audit: AUDIT_PERM support
add support for AUDIT_PERM predicate

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-09-11 13:32:30 -04:00
Al Viro dc104fb323 [PATCH] audit: more syscall classes added
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-09-11 13:32:27 -04:00
Keith Owens 01ebb77b31 [PATCH] x86_64: Save original IST values for checking stack addresses
The values in init_tss.ist[] can change when an IST event occurs.  Save
the original IST values for checking stack addresses when debugging or
doing stack traces.

Signed-off-by: Keith Owens <kaos@ocs.com.au>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-30 16:05:16 -07:00
Andi Kleen ceee882230 [PATCH] x86_64: Recover 1MB of kernel memory
Noticed by Jan Beulich.

When the kernel was moved from 1MB to 2MB in 2.6.17 the kernel reservation
code wasn't adjusted and it still reserved starting with 1MB. This means 1MB always
were lost.

This patch fixes this by reserving only starting with _text.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-30 16:05:15 -07:00
Jan Beulich ea424055b7 [PATCH] x86: Make backtracer fallback logic more bullet-proof
The unwinder fallback logic still had potential for falling through to
the legacy stack trace code without printing an indication (at once
serving as a separator) of this.

Further, the stack pointer retrieval for the fallback should be as
restrictive as possible (in order to avoid having the legacy stack
tracer try to access invalid memory). The patch tightens that, but
this could certainly be further improved.

Also making the call_trace command line option now conditional upon
CONFIG_STACK_UNWIND (as it's meaningless otherwise).

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-30 16:05:15 -07:00
Andi Kleen c05991ed12 [PATCH] x86_64: Add kernel thread stack frame termination for properly stopping stack unwinds.
One open question: Should these added pushes perhaps be made
conditional upon CONFIG_STACK_UNWIND or CONFIG_UNWIND_INFO?
[AK: Not needed -- these are all very slow paths]

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-30 16:05:15 -07:00
Andi Kleen 11012d419c [PATCH] x86: Revert e820 MCFG heuristics
The check for the MCFG table being reserved in the e820 map was originally
added to detect a broken BIOS in a preproduction Intel SDV. However it also
breaks the Apple x86 Macs, which can't supply this properly, but need
a working MCFG. With this patch they wouldn't use the MCFG and not work.

After some discussion I think it's best to remove the heuristic again.
It also failed on some other boxes (although it didn't cause much
problems there because old style port access for PCI config space
still works as fallback), but the preproduction SDVs can just use
pci=nommcfg. Supporting production machines properly is more
important.

Edgar Hucek did all the debugging work.

Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Edgar Hucek <hostmaster@ed-soft.at>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-30 16:05:15 -07:00
Horms 012c437d03 [PATCH] Change panic_on_oops message to "Fatal exception"
Previously the message was "Fatal exception: panic_on_oops", as introduced
in a recent patch whith removed a somewhat dangerous call to ssleep() in
the panic_on_oops path.  However, Paul Mackerras suggested that this was
somewhat confusing, leadind people to believe that it was panic_on_oops
that was the root cause of the fatal exception.  On his suggestion, this
patch changes the message to simply "Fatal exception".  A suitable oops
message should already have been displayed.

Signed-off-by: Simon Horman <horms@verge.net.au>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-08-14 12:54:29 -07:00
Alexey Dobriyan 825e037fb8 [PATCH] Fix more per-cpu typos
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-06 08:57:47 -07:00
Muli Ben-Yehuda a166222cde [PATCH] x86_64: Fix CONFIG_IOMMU_DEBUG
If CONFIG_IOMMU_DEBUG is set force_iommu defaults to 1. In the case
where no HW IOMMU is present in the machine and we end up using nommu,
leaving force_iommu set to 1 causes dma_alloc_coherent to do the wrong
thing. Therefore, if we end up using nommu, make sure force_iommu is
0.

Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-02 20:19:54 -07:00
Andi Kleen 2699500b31 [PATCH] x86_64: Fix backtracing for interrupt stacks
Re-add backlink for old style unwinder to stack switching.  Add proper
stack frame and CFI annotations to call_softirq

This prevents a oops when backtracing with fallback through the
interrupt stack top.

Suggested by Jan Beulich and Herbert Xu wanted it in 2.6.18.

Cc: jbeulich@novell.com
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-02 20:19:54 -07:00
Chandra Seetharaman be6b5a3505 [PATCH] cpu hotplug: use hotplug version of registration in late inits
Use hotplug version of register_cpu_notifier in late init functions.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-31 13:28:39 -07:00
Horms cea6a4ba8a [PATCH] panic_on_oops: remove ssleep()
This patch is part of an effort to unify the panic_on_oops behaviour across
all architectures that implement it.

It was pointed out to me by Andi Kleen that if an oops has occured in
interrupt context, then calling sleep() in the oops path will only cause a
panic, and that it would be really better for it not to be in the path at
all.

This patch removes the ssleep() call and reworks the console message
accordinly.  I have a slght concern that the resulting console message is
too long, feedback welcome.

For powerpc it also unifies the 32bit and 64bit behaviour.

Fror x86_64, this patch only updates the console message, as ssleep() is
already not present.

Signed-off-by: Horms <horms@verge.net.au>
Acked-by: Paul Mackerras <paulus@samba.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Chris Zankel <chris@zankel.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-31 13:28:39 -07:00
Eric W. Biederman 2a8a3d5b65 [PATCH] machine_kexec.c: Fix the description of segment handling
One of my original comments in machine_kexec was unclear
and this should fix it.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Andi Kleen <ak@muc.de>
Acked-by: Horms <horms@verge.net.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-31 13:28:38 -07:00
Andi Kleen 65f87d8a8a [PATCH] x86_64: Fix swiotlb=force
It was broken before. But having it is important as possible hardware
bug workaround.

And previously there was no way to force swiotlb if there is another IOMMU.
Side effect is that iommu=force won't force swiotlb anymore even if there
isn't another IOMMU.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-29 20:59:55 -07:00
Jon Mason d2105b10fe [PATCH] x86_64: Calgary IOMMU - Multi-Node NULL pointer dereference fix
Calgary hits a NULL pointer dereference when booting in a multi-chassis
NUMA system.  See Redhat bugzilla number 198498, found by Konrad
Rzeszutek (konradr@redhat.com).

There are many issues that had to be resolved to fix this problem.
Firstly when I originally wrote the code to handle NUMA systems, I
had a large misunderstanding that was not corrected until now.  That was
that I thought the "number of nodes online" referred to number of
physical systems connected.  So that if NUMA was disabled, there
would only be 1 node and it would only show that node's PCI bus.
In reality if NUMA is disabled, the system displays all of the
connected chassis as one node but is only ignorant of the delays
in accessing main memory.  Therefore, references to num_online_nodes()
and MAX_NUMNODES are incorrect and need to be set to the maximum
number of nodes that can be accessed (which are 8).  I created a
variable, MAX_NUM_CHASSIS, and set it to 8 to fix this.

Secondly, when walking the PCI in detect_calgary, the code only
checked the first "slot" when looking to see if a device is present.
This will work for most cases, but unfortunately it isn't always the
case.  In the NUMA MXE drawers, there are USB devices present on the
3rd slot (with slot 1 being empty).  So, to work around this, all
slots (up to 8) are scanned to see if there are any devices present.

Lastly, the bus is being enumerated on large systems in a different
way the we originally thought.  This throws the ugly logic we had
out the window.  To more elegantly handle this, I reorganized the
kva array to be sparse (which removed the need to have any bus number
to kva slot logic in tce.c) and created a secondary space array to
contain the bus number to phb mapping.

With these changes Calgary boots on an x460 with 4 nodes with and
without NUMA enabled.

Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-29 20:59:55 -07:00
Muli Ben-Yehuda 089bbbcb36 [PATCH] x86_64: Calgary IOMMU - fix off by one error
Fixed off-by-one error in detect_calgary and calgary_init which will
cause arrays to overflow.  Also, removed impossible to hit BUG_ON.

Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-29 20:59:55 -07:00