Commit Graph

198 Commits (d437441ef5336874e934bd53a03159a584efe95a)

Author SHA1 Message Date
Jakub Jelinek 4732efbeb9 [PATCH] FUTEX_WAKE_OP: pthread_cond_signal() speedup
ATM pthread_cond_signal is unnecessarily slow, because it wakes one waiter
(which at least on UP usually means an immediate context switch to one of
the waiter threads).  This waiter wakes up and after a few instructions it
attempts to acquire the cv internal lock, but that lock is still held by
the thread calling pthread_cond_signal.  So it goes to sleep and eventually
the signalling thread is scheduled in, unlocks the internal lock and wakes
the waiter again.

Now, before 2003-09-21 NPTL was using FUTEX_REQUEUE in pthread_cond_signal
to avoid this performance issue, but it was removed when locks were
redesigned to the 3 state scheme (unlocked, locked uncontended, locked
contended).

Following scenario shows why simply using FUTEX_REQUEUE in
pthread_cond_signal together with using lll_mutex_unlock_force in place of
lll_mutex_unlock is not enough and probably why it has been disabled at
that time:

The number is value in cv->__data.__lock.
        thr1            thr2            thr3
0       pthread_cond_wait
1       lll_mutex_lock (cv->__data.__lock)
0       lll_mutex_unlock (cv->__data.__lock)
0       lll_futex_wait (&cv->__data.__futex, futexval)
0                       pthread_cond_signal
1                       lll_mutex_lock (cv->__data.__lock)
1                                       pthread_cond_signal
2                                       lll_mutex_lock (cv->__data.__lock)
2                                         lll_futex_wait (&cv->__data.__lock, 2)
2                       lll_futex_requeue (&cv->__data.__futex, 0, 1, &cv->__data.__lock)
                          # FUTEX_REQUEUE, not FUTEX_CMP_REQUEUE
2                       lll_mutex_unlock_force (cv->__data.__lock)
0                         cv->__data.__lock = 0
0                         lll_futex_wake (&cv->__data.__lock, 1)
1       lll_mutex_lock (cv->__data.__lock)
0       lll_mutex_unlock (cv->__data.__lock)
          # Here, lll_mutex_unlock doesn't know there are threads waiting
          # on the internal cv's lock

Now, I believe it is possible to use FUTEX_REQUEUE in pthread_cond_signal,
but it will cost us not one, but 2 extra syscalls and, what's worse, one of
these extra syscalls will be done for every single waiting loop in
pthread_cond_*wait.

We would need to use lll_mutex_unlock_force in pthread_cond_signal after
requeue and lll_mutex_cond_lock in pthread_cond_*wait after lll_futex_wait.

Another alternative is to do the unlocking pthread_cond_signal needs to do
(the lock can't be unlocked before lll_futex_wake, as that is racy) in the
kernel.

I have implemented both variants, futex-requeue-glibc.patch is the first
one and futex-wake_op{,-glibc}.patch is the unlocking inside of the kernel.
 The kernel interface allows userland to specify how exactly an unlocking
operation should look like (some atomic arithmetic operation with optional
constant argument and comparison of the previous futex value with another
constant).

It has been implemented just for ppc*, x86_64 and i?86, for other
architectures I'm including just a stub header which can be used as a
starting point by maintainers to write support for their arches and ATM
will just return -ENOSYS for FUTEX_WAKE_OP.  The requeue patch has been
(lightly) tested just on x86_64, the wake_op patch on ppc64 kernel running
32-bit and 64-bit NPTL and x86_64 kernel running 32-bit and 64-bit NPTL.

With the following benchmark on UP x86-64 I get:

for i in nptl-orig nptl-requeue nptl-wake_op; do echo time elf/ld.so --library-path .:$i /tmp/bench; \
for j in 1 2; do echo ( time elf/ld.so --library-path .:$i /tmp/bench ) 2>&1; done; done
time elf/ld.so --library-path .:nptl-orig /tmp/bench
real 0m0.655s user 0m0.253s sys 0m0.403s
real 0m0.657s user 0m0.269s sys 0m0.388s
time elf/ld.so --library-path .:nptl-requeue /tmp/bench
real 0m0.496s user 0m0.225s sys 0m0.271s
real 0m0.531s user 0m0.242s sys 0m0.288s
time elf/ld.so --library-path .:nptl-wake_op /tmp/bench
real 0m0.380s user 0m0.176s sys 0m0.204s
real 0m0.382s user 0m0.175s sys 0m0.207s

The benchmark is at:
http://sourceware.org/ml/libc-alpha/2005-03/txt00001.txt
Older futex-requeue-glibc.patch version is at:
http://sourceware.org/ml/libc-alpha/2005-03/txt00002.txt
Older futex-wake_op-glibc.patch version is at:
http://sourceware.org/ml/libc-alpha/2005-03/txt00003.txt
Will post a new version (just x86-64 fixes so that the patch
applies against pthread_cond_signal.S) to libc-hacker ml soon.

Attached is the kernel FUTEX_WAKE_OP patch as well as a simple-minded
testcase that will not test the atomicity of the operation, but at least
check if the threads that should have been woken up are woken up and
whether the arithmetic operation in the kernel gave the expected results.

Acked-by: Ingo Molnar <mingo@redhat.com>
Cc: Ulrich Drepper <drepper@redhat.com>
Cc: Jamie Lokier <jamie@shareable.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Yoichi Yuasa <yuasa@hh.iij4u.or.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07 16:57:17 -07:00
Ashok Raj 54d5d42404 [PATCH] x86/x86_64: deferred handling of writes to /proc/irqxx/smp_affinity
When handling writes to /proc/irq, current code is re-programming rte
entries directly. This is not recommended and could potentially cause
chipset's to lockup, or cause missing interrupts.

CONFIG_IRQ_BALANCE does this correctly, where it re-programs only when the
interrupt is pending. The same needs to be done for /proc/irq handling as well.
Otherwise user space irq balancers are really not doing the right thing.

- Changed pending_irq_balance_cpumask to pending_irq_migrate_cpumask for
  lack of a generic name.
- added move_irq out of IRQ_BALANCE, and added this same to X86_64
- Added new proc handler for write, so we can do deferred write at irq
  handling time.
- Display of /proc/irq/XX/smp_affinity used to display CPU_MASKALL, instead
  it now shows only active cpu masks, or exactly what was set.
- Provided a common move_irq implementation, instead of duplicating
  when using generic irq framework.

Tested on i386/x86_64 and ia64 with CONFIG_PCI_MSI turned on and off.
Tested UP builds as well.

MSI testing: tbd: I have cards, need to look for a x-over cable, although I
did test an earlier version of this patch.  Will test in a couple days.

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Acked-by: Zwane Mwaikambo <zwane@holomorphy.com>
Grudgingly-acked-by: Andi Kleen <ak@muc.de>
Signed-off-by: Coywolf Qi Hunt <coywolf@lovecn.org>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07 16:57:15 -07:00
Mark Maule 5fbcf9a5c6 [IA64-SGI] volatile semantics in places where it seems necessary
Resend using accessors instead of volatile qualifiers per hch comments, and
easier to understand convenience macros per rja comments.

Patch to apply volatile semantics when accessing MMR's in various SN files.

Signed-off-by: Mark Maule <maule@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-09-07 16:23:41 -07:00
Kenji Kaneshige 9799e4d39a [IA64] Minor cleanups - remove unnecessary function prototype in iosapic.h
The function prototypes for iosapic_enable_intr() and
iosapic_pci_fixup() in include/asm-ia64/iosapic.h are no longer
needed. This patch removes them. The original patch has been posted by
Satoru Takeuchi.

Signed-off-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-09-07 14:00:40 -07:00
Kenji Kaneshige 697eaad417 [IA64] Minor cleanups - remove CONFIG_ACPI_DEALLOCATE_IRQ
The config option 'CONFIG_ACPI_DEALLOCATE_IRQ' is no longer
needed. This patch removes it.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-09-07 14:00:08 -07:00
Kenji Kaneshige a52ac87eb2 [IA64] Minor cleanups - remove unnecessary function prototype in irq.h
The function prototype for handl_IRQ_event() in include/asm-ia64/irq.h
is no longer needed. This patch removes it.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-09-07 13:59:40 -07:00
Dean Nelson a607c38971 [IA64-SGI] get XPC to cleanly disengage from remote memory references
When XPC is being shutdown (i.e., rmmod, reboot) it doesn't ensure that
other partitions with whom it was connected have completely disengaged
from any attempt at cross-partition memory references. This can lead to
MCAs in any of these other partitions when the partition is reset.

Signed-off-by: Dean Nelson <dcn@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-09-06 16:15:38 -07:00
Bruce Losure 25732ad493 [IA64] Altix patch for fpga reset
1) workaround a h/w reset issue
2) to improve the determination of FPGA-based h/w in
   the arch/ia64/sn/kernel/tiocx code.

Signed-off-by: Bruce Losure <blosure@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-09-06 14:16:01 -07:00
Kyle Moffett fa5b08d5f8 [PATCH] sab: consolidate kmem_bufctl_t
This is used only in slab.c and each architecture gets to define whcih
underlying type is to be used.

Seems a bit silly - move it to slab.c and use the same type for all
architectures: unsigned int.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:05:48 -07:00
Len Brown 129521dcc9 Merge linux-2.6 into linux-acpi-2.6 test 2005-09-03 02:44:09 -04:00
Jack Steiner a1cddb8892 [IA64-SGI] Add new vendor-specific SAL calls for:
- notifying the PROM of specific features that are supported by the OS.
  This is used to enable PROM feature if and only if the corresponding
  feature is implemented in the OS

- fetch feature sets that are supported by the current PROM. This allows
  the OS to selectively enable features when the PROM support is available.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-08-31 11:00:53 -07:00
Tony Luck ff67b59726 [IA64] Low byte of current->personality is not a bitmask.
Peter Staubach pointed out that it is not correct to check
current->personality & PER_LINUX32 (this will have false
hits on several other personality values).

Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-08-30 14:59:24 -07:00
Tony Luck 288ceb8f14 Auto-update from upstream 2005-08-30 09:30:09 -07:00
Tony Luck e438befd76 [IA64-SGI] One new use of "UNCACHED" needed fixing for sn2 region cleanup
Some shub2 changes were not in the tree when Greg cleaned up the sn2
region definitions in 1b66776da7, so this
one didn't get fixed.

Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-08-29 16:13:36 -07:00
Tony Luck 3290580285 Pull rationalise-regions into release branch 2005-08-29 15:50:32 -07:00
Tony Luck bcdd3a9114 Pull ngam-maule-steiner into release branch 2005-08-29 15:48:51 -07:00
Tony Luck b946ecbb11 Pull pending-2.6.14 into release branch 2005-08-29 15:48:23 -07:00
Patrick McHardy b0573dea1f [NET]: Introduce SO_{SND,RCV}BUFFORCE socket options
Allows overriding of sysctl_{wmem,rmrm}_max

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:31:35 -07:00
Tony Luck 7115c13bd6 Pull acpi-p-state into release branch 2005-08-29 14:15:10 -07:00
Tony Luck 87dbaaabde Pull lameter-rwsem-limit into release branch 2005-08-29 11:39:18 -07:00
Tony Luck 7ee175f741 Pull mm-context-fix into release branch 2005-08-29 11:38:48 -07:00
Tony Luck dcf8296216 Pull lameter-spinlock-optimization into release branch 2005-08-29 11:38:01 -07:00
Venkatesh Pallipadi 4db8699bcf [IA64] Add ACPI based P-state support
Patch to support P-state transitions on ia64. This driver is based on ACPI,
and uses the ACPI processor driver interface to find out the P-state support
information for the processor. This driver plugs into generic cpufreq
infrastructure.

Once this driver is loaded successfully, ondemand/userspace governor can be
used to change the CPU frequency dynamically based on load or on request from
userspace process.

Refer :
ACPI specification -
      http://www.acpi.info
P-state related PAL calls -
      http://developer.intel.com/design/itanium/downloads/24869909.pdf

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-08-26 15:09:24 -07:00
Mark Maule 8409668b56 [IA64] altix: Abstract irq_affinity at the sn pci provider
Altix patch to abstract irq_affinity down to the pci provider level since
different SGI hardware implements this in different ways.

Signed-off-by: Mark Maule <maule@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-08-26 12:09:01 -07:00
Russ Anderson 5b9021bc58 [IA64] SGI SN remove redundant partition SAL call
Clean up of SGI SN partitioning related code.
The SN_SAL_GET_SN_INFO SAL call returns the partition ID, making
the SN_SAL_SYSCTL_PARTITION_GET SAL call redundant.  Remove sn_partid
and use sn_partition_id.

Signed-off-by: Russ Anderson (rja@sgi.com)
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-08-24 16:26:13 -07:00
Mark Goodwin 60a3ba0bb4 [IA64] - SGI SN hwperf enhancements -
Add a new exported function for determining the nearest node
with CPUs for I/O nodes and fix a bug where the hwperf dynamic
misc device was being registered before misc_init(). 

Signed-off-by: Mark Goodwin <markgw@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-08-24 16:24:42 -07:00
Mark Goodwin ecc3c30ae3 [IA64] - SGI SN hwperf enhancements - export_pci_topology
Bugfix to export PCI topology information in /proc/sgi_sn/sn_topology.

Signed-off-by: Mark Goodwin <markgw@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-08-24 16:23:39 -07:00
Greg Edwards 1b66776da7 [IA64] clean up sn2 region definitions
Clean up some duplicate region definitions in sn2 code.

Signed-off-by: Greg Edwards <edwardsg@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-08-24 15:37:26 -07:00
Peter Chubb 0a41e25011 [IA64] Rationalise Region Definitions
Currently, region numbers are defined in several files, with several 
names.  For example, we have REGION_KERNEL in asm/page.h and 
RGN_KERNEL in pgtable.h 
 
We also have address definitions that should depend on the 
RGN_XXX macros, but are currently just long constants. 
 
The following patch reorganises all the definitions so that they have 
the same form (RGN_XXX), are in one place, and that addresses that 
depend on RGN_XXX are derived from them. 

(This is a necessary but not sufficient patch to allow UML-like 
operation on IA64). 

Thanks to David Mosberger for catching the change I missed in mmu_context.h.
 
Signed-off-by: Peter Chubb <peterc@gelato.unsw.edu.au> 
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-08-24 15:35:41 -07:00
Bjorn Helgaas b5f3616ba7 [IA64] fix IO_SPACE_SPARSE_ENCODING macro ambiguity
Parenthesize "p" to avoid ambiguity.  No callers have a problem today;
this is just to clean up the bad form.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-08-24 14:55:09 -07:00
Len Brown 84ffa74752 Merge from-linus to-akpm 2005-08-23 22:12:23 -04:00
Christoph Lameter 16592d2698 [IA64] Remove rwsem limitation of 32k waiters
We ran into the limit with the maximum number of waiters at one of our sites.

This patch increases the number of possible waiters from 2^15 to 2^31 by using
a long for the counter in struct rw_semaphore. S390 and alpha already do this.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Acked-by: Kenneth Chen <kenneth.w.chen@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-08-23 10:31:57 -07:00
Tony Luck 3a931d4cca [IA64] remove unused function __ia64_get_io_port_base
Not only was this unused, but its somewhat eccentric declaration
of "static inline const unsigned long" gives gcc4 heartburn.

Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-08-18 14:40:00 -07:00
Jack Steiner 470ceb05d9 [IA64-SGI] - New SN hardware support - ptc_fixes
Shub2 provides a much improved mechanism for issuing internode
TLB purges. Add code to support the newer mechanism. There is also 
some debug code (disabled) that is useful for testing.

Collect statistics on the number, type & duration of TLB purges.
This data will be useful for making future improvements in the algorithms.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-08-17 15:32:24 -07:00
Jack Steiner 68b9753f47 [IA64-SGI] - New SN hardware support - cpu_relax
Add a few missing calls to "hint @pause".

Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-08-17 15:31:53 -07:00
Jack Steiner 3d14487b26 [IA64-SGI] - New SN hardware support - addr_macros
Update the SN address macros so that they work on both shub1 
and shub2. Most of the code to support shub2 was added last year
but this patch fixes a few bugs and adds macros to help generate
both processor-specific physical addresses & numalink physical
addresses. More cleanup & optimization will be done later.


Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-08-17 15:29:11 -07:00
David Mosberger-Tang badea125d7 [IA64] Fix race in mm-context wrap-around logic.
The patch below should fix a race which could cause stale TLB entries.
Specifically, when 2 CPUs ended up racing for entrance to
wrap_mmu_context().  The losing CPU would find that by the time it
acquired ctx.lock, mm->context already had a valid value, but then it
failed to (re-)check the delayed TLB flushing logic and hence could
end up using a context number when there were still stale entries in
its TLB.  The fix is to check for delayed TLB flushes only after
mm->context is valid (non-zero).  The patch also makes GCC v4.x
happier by defining a non-volatile variant of mm_context_t called
nv_mm_context_t.

Signed-off-by: David Mosberger-Tang <David.Mosberger@acm.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-08-12 15:05:21 -07:00
Mark Maule c9221da9f2 [IA64-SGI] sn pci provider for TIOCE (pci
Altix patch to add an SN pci provider for TIOCE, which is SGI's 
PCI Express implementation.

Signed-off-by: Mark Maule <maule@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-08-11 15:56:31 -07:00
Mark Maule 5b53ed1f2e [IA64-SGI] add support for TIO huge-window
Altix patch to add TIO "huge-window" address support to sn_dma_flush().

Update copyright in affected files.

Signed-off-by: Mark Maule <maule@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-08-11 15:37:13 -07:00
Mark Maule 735e60f4c6 [IA64-SGI] abstract force_interrupt() mechanism
Altix patch to abstract the force_interrupt() mechanism away from the
pcibr provider.

Signed-off-by: Mark Maule <maule@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-08-11 15:36:39 -07:00
Mark Maule 89963d16dc [IA64-SGI] altix: cosmetic rename of SGI_PCIBR_ERROR
Cosmetic altix patch to rename SGI_PCIBR_ERROR to something more generic and
remove a duplicate #define.

Signed-off-by: Mark Maule <maule@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-08-11 15:36:00 -07:00
Colin Ngam 674c6479b7 [IA64-SGI] Altix only: Add PCI Domain number support.
This patch enables PCI Domain numbering on Altix.

Signed-off-by: Colin Ngam <cngam@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-08-11 15:35:13 -07:00
Christoph Lameter f521089158 [IA64] Spinlock optimizations
1. Nontemporal store for spin unlock.

A nontemporal store will not update the LRU setting for the cacheline. The
cacheline with the lock may therefore be evicted faster from the cpu
caches. Doing so may be useful since it increases the chance that the
exclusive cache line has been evicted when another cpu is trying to
acquire the lock.

The time between dropping and reacquiring a lock on the same cpu is
typically very small so the danger of the cacheline being
evicted is negligible.

2. Avoid semaphore operation in write_unlock and use nontemporal store

write_lock uses a cmpxchg like the regular spin_lock but write_unlock uses
clear_bit which requires a load and then a loop over a cmpxchg. The
following patch makes write_unlock simply use a nontemporal store to clear
the highest 8 bits. We will then still have the lower 3 bytes (24 bits)
left to count the readers.

Doing the byte store will reduce the number of possible readers from 2^31
to 2^24 = 16 million.

These patches were discussed already:

http://marc.theaimsgroup.com/?t=111472054400001&r=1&w=2
http://marc.theaimsgroup.com/?l=linux-ia64&m=111401837707849&w=2

The nontemporal stores will only work using GCC. If a compiler is used
that does not support inline asm then fallback C code is used. This
will preserve the byte store but not be able to do the nontemporal stores.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-08-10 16:13:10 -07:00
Kenji Kaneshige 1c53e4357e [IA64] fix iosapic_remove build error for !HOTPLUG
This patch removes the following stupid compile error that happens
when CONFIG_HOTPLUG is not defined on ia64.

     arch/ia64/kernel/built-in.o(.text+0x712): In function `acpi_unregister_ioapic':
     : undefined reference to `iosapic_remove'

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-08-10 15:42:41 -07:00
Len Brown 1d492eb413 [ACPI] Merge acpi-2.6.12 branch into 2.6.13-rc3
Signed-off-by: Len Brown <len.brown@intel.com>
2005-08-05 00:31:42 -04:00
Andrew Morton e92310a930 [ACPI] fix IA64 build warning
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Len Brown <len.brown@intelc.com>
2005-08-04 22:29:34 -04:00
Robert Love d108919b2b [IA64] inotify: ia64 syscalls.
Attached patch adds the inotify syscalls to ia64.

Signed-off-by: Robert Love <rml@novell.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-07-27 10:46:12 -07:00
Eric W. Biederman 7c9034735e [PATCH] Add emergency_restart()
When the kernel is working well and we want to restart cleanly
kernel_restart is the function to use.   But in many instances
the kernel wants to reboot when thing are expected to be working
very badly such as from panic or a software watchdog handler.

This patch adds the function emergency_restart() so that
callers can be clear what semantics they expect when calling
restart.  emergency_restart() is expected to be callable
from interrupt context and possibly reliable in even more
trying circumstances.

This is an initial generic implementation for all architectures.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-26 14:35:41 -07:00
Tony Luck 99ad25a313 Auto merge with /home/aegl/GIT/linus 2005-07-13 12:15:43 -07:00
Christoph Lameter 7c2a6c62c0 [IA64] Altix pcibus_to_node implementation
The Altix subarch does not provide node information via ACPI. Instead hooks
are used to fixup pci structures. This patch determines the nodes for Altix
PCI busses.

Remote Bridges:
---------------
Altix supports remote I/O nodes without memory or processors but with bridges.
The TIOCA type of bridge is an AGP bridge and the PROM provides information
about the closest node. That information will be returned by pcibus_to_node.

The TIOCP remote bridge type is a PCI bridge but the PROM does not provide a
closest node id. pcibus_to_node will return -1 for devices on those bridges
meaning that device control structures may be allocated on any node.

Safeguard:
----------
Should the fixups result in invalid node information for a pci controller then
a warning will be printed and pcibus_to_node will return -1.


This patch also fixes the "FIXME" in sn_dma_alloc_coherent. This means that
dma_alloc_coherent will now use alloc_pages_node to allocate memory local to
the node that the PCI device is connected to.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-07-12 16:12:55 -07:00
Len Brown 5028770a42 [ACPI] merge acpi-2.6.12 branch into latest Linux 2.6.13-rc...
Signed-off-by: Len Brown <len.brown@intel.com>
2005-07-12 17:21:56 -04:00
Greg Edwards 60a762b6a6 [IA64] remove CONFIG_IA64_SGI_SN_SIM
This patch removes the CONFIG_IA64_SGI_SN_SIM option entirely, allowing
any kernel bootable on sn2 to also be booted in the simulator.

Boot tested on Altix and HP rx2600.

Signed-off-by: Greg Edwards <edwardsg@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-07-12 14:13:08 -07:00
Christoph Lameter 514604c6d1 [IA64] pcibus_to_node implementation for IA64
pcibus_to_node provides a way for the Linux kernel to identify to which
node a certain pcibus connects to. Allocations of control structures
for devices can then be made on the node where the pci bus is located
to allow local access during interrupt and other device manipulation.

This patch provides a new "node" field in the the pci_controller
structure. The node field will be set based on ACPI information (thanks
to Alex Williamson  <alex.williamson@hp.com for that piece).

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-07-12 11:04:22 -07:00
David Shaohua Li c9c3e457de [ACPI] PNPACPI vs sound IRQ
http://bugme.osdl.org/show_bug.cgi?id=4016

Written-by: David Shaohua Li <shaohua.li@intel.com>
Acked-by: Adam Belay <abelay@novell.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2005-07-12 00:03:30 -04:00
Ashok Raj 55e59c511c [ACPI] Evaluate CPEI Processor Override flag
ACPI 3.0 added a Correctable Platform Error Interrupt (CPEI)
Processor Overide flag to MADT.Platform_Interrupt_Source.
Record the processor that was provided as hint from ACPI.

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2005-07-12 00:01:41 -04:00
Kenji Kaneshige 3b5cc09033 [IA64] assign_irq_vector() should not panic
Current assign_irq_vector() will panic if interrupt vectors is running
out. But I think how to handle the case of lack of interrupt vectors
should be handled by the caller of this function. For example, some
PCI devices can raise the interrupt signal via both MSI and I/O
APIC. So even if the driver for these device fails to allocate a
vector for MSI, the driver still has a chance to use I/O APIC based
interrupt. But currently there is no chance for these driver to use
I/O APIC based interrupt because kernel will panic when
assign_irq_vector() fails to allocate interrupt vector.

The following patch changes assign_irq_vector() for ia64 to return
-ENOSPC on error instead of panic (as i386 and x86_64 versions do).

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-07-11 10:30:07 -07:00
Olaf Hering d0feafbf14 [IA64] remove linux/version.h include from arch/ia64
changing CONFIG_LOCALVERSION rebuilds too much, for no appearent reason.

Signed-off-by: Olaf Hering <olh@suse.de>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-07-11 09:58:52 -07:00
Tony Luck 8d7e35174d [IA64] fix generic/up builds
Jesse Barnes provided the original version of this patch months ago, but
other changes kept conflicting with it, so it got deferred.  Greg Edwards
dug it out of obscurity just over a week ago, and almost immediately
another conflicting patch appeared (Bob Picco's memory-less nodes).

I've resolved the conflicts and got it running again.  CONFIG_SGI_TIOCX
is set to "y" in defconfig, which causes a Tiger to not boot (oops in
tiocx_init).  But that can be resolved later ... get this in now before it
gets stale again.

Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-07-06 18:18:10 -07:00
Prarit Bhargava 7fe4c1b168 [IA64] hotplug/ia64: SN Hotplug Driver - PREEMPT/pcibus_info fix
This patch fixes an issue with the PROM and a kernel running with
CONFIG_PREEMPT enabled.  When CONFIG_PREEMPT is enabled, the size of a
spinlock_t changes -- resulting in the PROM writing to an incorrect location.

Signed-off-by: Prarit Bhargava <prarit@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-07-06 15:30:25 -07:00
Prarit Bhargava 6f354b014b [IA64] hotplug/ia64: SN Hotplug Driver - SN Hotplug Driver code
This patch is the SGI hotplug driver and additional changes required for
the driver.  These modifications include changes to the SN io_init.c code
for memory management, the inclusion of new SAL calls to enable and disable
PCI slots, and a hotplug-style driver.

Signed-off-by: Prarit Bhargava <prarit@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-07-06 15:29:53 -07:00
Prarit Bhargava 283c7f6ac6 [IA64] hotplug/ia64: SN Hotplug Driver - new SN PROM version code
This patch is a rewrite of the code to check the PROM version.  The current
code has some deficiences in the way PROM comparisons were made.  The minimum
value of PROM that will boot has also been changed to 4.04.

Signed-off-by: Prarit Bhargava <prarit@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-07-06 15:29:13 -07:00
Prarit Bhargava c13cf3714f [IA64] hotplug/ia64: SN Hotplug Driver: moving of header files
This patch moves header files out of the arch/ia64/sn directories and into
include/asm-ia64/sn.  These files were being included by other subsystems
and should be under include/asm-ia64/sn.

Signed-off-by: Prarit Bhargava <prarit@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-07-06 15:26:51 -07:00
Prarit Bhargava cb4cb2cb9b [IA64] hotplug/ia64: SN Hotplug Driver: SN IRQ Fixes
This patch  fixes the SN IRQ code such that cpu affinity and
Hotplug can modify IRQ values.  The sn_irq_info structures are now locked
using a RCU lock mechanism to avoid lock contention in the lost interrupt
WAR code.

Signed-off-by: Prarit Bhargava <prarit@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-07-06 14:59:44 -07:00
Tony Luck d18bfacff2 Auto merge with /home/aegl/GIT/linus 2005-06-29 15:21:41 -07:00
Peter Chubb a68db763af [IA64] Fix another IA64 preemption problem
There's another problem shown up by Ingo's recent patch to make
smp_processor_id() complain if it's called with preemption enabled.
local_finish_flush_tlb_mm() calls activate_context() in a situation
where it could be rescheduled to another processor.  This patch
disables preemption around the call.

Signed-off-by: Peter Chubb <peterc@gelato.unsw.edu.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-06-28 10:01:19 -07:00
Bruce Losure c1ffb6a8aa [IA64-SGI] Altix patch to tiocx, add subsys_initcall
This patch fixes an ordering issue between the init code for the
tiocx bus driver and tiocx-related device drivers.   Also adds
a new brick to the list of known FPGA bricks.

Signed-off-by: Bruce Losure <blosure@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-06-28 09:53:42 -07:00
Prarit Bhargava 92a582ed27 [IA64] sparse cleanup of TIOCA files
This patch is a sparse compile cleanup of tioca_provider.c, sn_hwperf.h, and
tioca_provider.h.  Each of these files had sparse warnings when
compiled.

Signed-off-by: Prarit Bhargava <prarit@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-06-28 09:50:48 -07:00
Colin Ngam 71030994a7 [IA64-SGI] Fix TIO IOSPACE MMR Addres
This patches provides support on Shub2 for the separate TIO IOSPACE MMR.  This 
patch is SN specific.

Signed-off-by: Colin Ngam <cngam@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-06-28 09:48:53 -07:00
Jack Steiner 71a5d027c9 [IA64-SGI] - new macros for SGI SN simulator
This patch changes some macros that are used when running kernel on the
SGI simulator.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-06-28 09:45:45 -07:00
Prarit Bhargava 8e4641b3ee [IA64] sparse cleanup of shub_mmr.h
This patch is a sparse compile cleanup of shub_mmr.h using both the defconfig
and the sn2_defconfig config files.

The issue with this file was the missing usage of __IA64_UL_CONST wrapper.
This wrapper is defined in include/asm-ia64/types.h and wraps a long
constant definition with UL or with nothing depending on its usage in the
kernel.  The missing wrapper caused many sparse compile errors like

        warning: constant 0x0x0000000010000380 so big it is long

Signed-off-by: Prarit Bhargava <prarit@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-06-28 09:37:16 -07:00
Mark Maule 66b7f8a304 [IA64-SGI] pcdp: add PCDP pci interface support
Resend 2 with changes per Bjorn Helgaas comments.  Changes from original:

+ Change globals to vga_console_iobase/vga_console_membase and make them
  unconditional.
+ Address style-related comments.

Patch to extend the PCDP vga setup code to support PCI io/mem translations
for the legacy vga ioport and ram spaces on architectures (e.g. altix) which
need them.

Summary of the changes:

drivers/firmware/pcdp.c
drivers/firmware/pcdp.h
-----------------------
+ add declaration for the spec-defined PCI interface struct (pcdp_if_pci)
  as well as support macros.

+ extend setup_vga_console() to know about pcdp_if_pci and add a couple of
  globals to hold the io and mem translation offsets if present.

arch/ia64/kernel/setup.c
------------------------
+ tweek early_console_setup() to allow multiple early console setup routines
  to be called.

include/asm-ia64/vga.h
----------------------
+ make VGA_MAP_MEM vga_console_membase aware

Signed-off-by: Mark Maule <maule@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-06-28 09:09:06 -07:00
Greg KH 8644d2a42b Merge rsync://rsync.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 2005-06-27 22:07:56 -07:00
Andrew Morton bb4a61b6ea [PATCH] PCI: fix up errors after dma bursting patch and CONFIG_PCI=n
With CONFIG_PCI=n:

In file included from include/linux/pci.h:917,
                 from lib/iomap.c:6:
include/asm/pci.h:104: warning: `enum pci_dma_burst_strategy' declared inside parameter list
include/asm/pci.h:104: warning: its scope is only this definition or declaration, which is probably not what you want.
include/asm/pci.h: In function `pci_dma_burst_advice':
include/asm/pci.h:106: dereferencing pointer to incomplete type
include/asm/pci.h:106: `PCI_DMA_BURST_INFINITY' undeclared (first use in this function)
include/asm/pci.h:106: (Each undeclared identifier is reported only once
include/asm/pci.h:106: for each function it appears in.)
make[1]: *** [lib/iomap.o] Error 1

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-06-27 21:52:46 -07:00
David S. Miller e24c2d963a [PATCH] PCI: DMA bursting advice
After seeing, at best, "guesses" as to the following kind
of information in several drivers, I decided that we really
need a way for platforms to specifically give advice in this
area for what works best with their PCI controller implementation.

Basically, this new interface gives DMA bursting advice on
PCI.  There are three forms of the advice:

1) Burst as much as possible, it is not necessary to end bursts
   on some particular boundary for best performance.

2) Burst on some byte count multiple.  A DMA burst to some multiple of
   number of bytes may be done, but it is important to end the burst
   on an exact multiple for best performance.

   The best example of this I am aware of are the PPC64 PCI
   controllers, where if you end a burst mid-cacheline then
   chip has to refetch the data and the IOMMU translations
   which hurts performance a lot.

3) Burst on a single byte count multiple.  Bursts shall end
   exactly on the next multiple boundary for best performance.

   Sparc64 and Alpha's PCI controllers operate this way.  They
   disconnect any device which tries to burst across a cacheline
   boundary.

   Actually, newer sparc64 PCI controllers do not have this behavior.
   That is why the "pdev" is passed into the interface, so I can
   add code later to check which PCI controller the system is using
   and give advice accordingly.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-06-27 21:52:45 -07:00
Kenji Kaneshige 0e888adc41 [PATCH] ACPI based I/O APIC hot-plug: ia64 support
This is an ia64 implementation of acpi_register_ioapic() and
acpi_unregister_ioapic() interfaces.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-06-27 21:52:44 -07:00
Keshavamurthy Anil S c7b645f934 [PATCH] kprobes/ia64: refuse kprobe on ivt code
Not safe to insert kprobes on IVT code.

This patch checks to see if the address on which Kprobes is being inserted is
in ivt code and if it is in ivt code then refuse to register kprobe.

Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Acked-by: David Mosberger <davidm@napali.hpl.hp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-27 15:23:54 -07:00
Rusty Lynch 9508dbfe39 [PATCH] Return probe redesign: ia64 specific implementation
The following patch implements function return probes for ia64 using
the revised design.  With this new design we no longer need to do some
of the odd hacks previous required on the last ia64 return probe port
that I sent out for comments.

Note that this new implementation still does not resolve the problem noted
by Keith Owens where backtrace data is lost after a return probe is hit.

Changes include:
 * Addition of kretprobe_trampoline to act as a dummy function for instrumented
   functions to return to, and for the return probe infrastructure to place
   a kprobe on on, gaining control so that the return probe handler
   can be called, and so that the instruction pointer can be moved back
   to the original return address.
 * Addition of arch_init(), allowing a kprobe to be registered on
   kretprobe_trampoline
 * Addition of trampoline_probe_handler() which is used as the pre_handler
   for the kprobe inserted on kretprobe_implementation.  This is the function
   that handles the details for calling the return probe handler function
   and returning control back at the original return address
 * Addition of arch_prepare_kretprobe() which is setup as the pre_handler
   for a kprobe registered at the beginning of the target function by
   kernel/kprobes.c so that a return probe instance can be setup when
   a caller enters the target function.  (A return probe instance contains
   all the needed information for trampoline_probe_handler to do it's job.)
 * Hooks added to the exit path of a task so that we can cleanup any left-over
   return probe instances (i.e. if a task dies while inside a targeted function
   then the return probe instance was reserved at the beginning of the function
   but the function never returns so we need to mark the instance as unused.)

Signed-off-by: Rusty Lynch <rusty.lynch@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-27 15:23:53 -07:00
Ananth N Mavinakayanahalli 9ec4b1f356 [PATCH] kprobes: fix single-step out of line - take2
Now that PPC64 has no-execute support, here is a second try to fix the
single step out of line during kprobe execution.  Kprobes on x86_64 already
solved this problem by allocating an executable page and using it as the
scratch area for stepping out of line.  Reuse that.

Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-27 15:23:52 -07:00
Jens Axboe 22e2c507c3 [PATCH] Update cfq io scheduler to time sliced design
This updates the CFQ io scheduler to the new time sliced design (cfq
v3).  It provides full process fairness, while giving excellent
aggregate system throughput even for many competing processes.  It
supports io priorities, either inherited from the cpu nice value or set
directly with the ioprio_get/set syscalls.  The latter closely mimic
set/getpriority.

This import is based on my latest from -mm.

Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-27 14:33:29 -07:00
Nick Piggin 4866cde064 [PATCH] sched: cleanup context switch locking
Instead of requiring architecture code to interact with the scheduler's
locking implementation, provide a couple of defines that can be used by the
architecture to request runqueue unlocked context switches, and ask for
interrupts to be enabled over the context switch.

Also replaces the "switch_lock" used by these architectures with an oncpu
flag (note, not a potentially slow bitflag).  This eliminates one bus
locked memory operation when context switching, and simplifies the
task_running function.

Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:43 -07:00
Nick Piggin 687f1661d3 [PATCH] sched: sched tuning
Do some basic initial tuning.

Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:42 -07:00
David Mosberger-Tang e608a8072b [IA64] Fix pfn_to_nid() so the kernel compiles again for !CONFIG_DISCONTIGMEM.
Signed-off-by: David Mosberger-Tang <davidm@hpl.hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-06-23 14:52:51 -07:00
Stephen Rothwell 0d77e5a2c2 [PATCH] compat: introduce compat_time_t
This patch is based on work by Carlos O'Donell and Matthew Wilcox.  It
introduces/updates the compat_time_t type and uses it for compat siginfo
structures.  I have built this on ppc64 and x86_64.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:32 -07:00
Jan Beulich 11c80c8367 [PATCH] adjust per_cpu definition in non-SMP case
Fix (in the architectures I'm actually building for) the UP definition of
per_cpu so that the cpu specified may be any expression, not just an
identifier or a suffix expression.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:28 -07:00
Yoav Zach ef3daeda7b [PATCH] Don't force O_LARGEFILE for 32 bit processes on ia64
In ia64 kernel, the O_LARGEFILE flag is forced when opening a file.  This
is problematic for execution of 32 bit processes, which are not largefile
aware, either by SW emulation or by HW execution.

For such processes, the problem is two-fold:

1) When trying to open a file that is larger than 4G
   the operation should fail, but it's not
2) Writing to offset larger than 4G should fail, but
   it's not

The proposed patch takes advantage of the way 32 bit processes are
identified in ia64 systems.  Such processes have PER_LINUX32 for their
personality.  With the patch, the ia64 kernel will not enforce the
O_LARGEFILE flag if the current process has PER_LINUX32 set.  The behavior
for all other architectures remains unchanged.

Signed-off-by: Yoav Zach <yoav.zach@intel.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:28 -07:00
Anil S Keshavamurthy 1674eafcbd [PATCH] Kprobes IA64: cmp ctype unc support
The current Kprobes when patching the original instruction with the break
instruction tries to retain the original qualifying predicate(qp), however
for cmp.crel.ctype where ctype == unc, which is a special instruction
always needs to be executed irrespective of qp.  Hence, if the instruction
we are patching is of this type, then we should not copy the original qp to
the break instruction, this is because we always want the break fault to
happen so that we can emulate the instruction.

This patch is based on the feedback given by David Mosberger

Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:23 -07:00
Rusty Lynch 8bc76772ad [PATCH] Kprobes ia64 cleanup
A cleanup of the ia64 kprobes implementation such that all of the bundle
manipulation logic is concentrated in arch_prepare_kprobe().

With the current design for kprobes, the arch specific code only has a
chance to return failure inside the arch_prepare_kprobe() function.

This patch moves all of the work that was happening in arch_copy_kprobe()
and most of the work that was happening in arch_arm_kprobe() into
arch_prepare_kprobe().  By doing this we can add further robustness checks
in arch_arm_kprobe() and refuse to insert kprobes that will cause problems.

Signed-off-by: Rusty Lynch <Rusty.lynch@intel.com>
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:23 -07:00
Anil S Keshavamurthy cd2675bf65 [PATCH] Kprobes/IA64: support kprobe on branch/call instructions
This patch is required to support kprobe on branch/call instructions.

Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:23 -07:00
Anil S Keshavamurthy b2761dc262 [PATCH] Kprobes/IA64: architecture specific JProbes support
This patch adds IA64 architecture specific JProbes support on top of Kprobes

Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Rusty Lynch <Rusty.lynch@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:22 -07:00
Anil S Keshavamurthy fd7b231ff9 [PATCH] Kprobes/IA64: arch specific handling
This is an IA64 arch specific handling of Kprobes

Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Rusty Lynch <Rusty.lynch@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:22 -07:00
Anil S Keshavamurthy 7213b25218 [PATCH] Kprobes/IA64: kdebug die notification mechanism
As many of you know that kprobes exist in the main line kernel for various
architecture including i386, x86_64, ppc64 and sparc64.  Attached patches
following this mail are a port of Kprobes and Jprobes for IA64.

I have tesed this patches for kprobes and Jprobes and this seems to work fine.
 I have tested this patch by inserting kprobes on various slots and various
templates including various types of branch instructions.

I have also tested this patch using the tool
http://marc.theaimsgroup.com/?l=linux-kernel&m=111657358022586&w=2 and the
kprobes for IA64 works great.

Here is list of TODO things and pathes for the same will appear soon.

1) Support kprobes on "mov r1=ip" type of instruction
2) Support Kprobes and Jprobes to exist on the same address
3) Support Return probes
3) Architecture independent cleanup of kprobes

This patch adds the kdebug die notification mechanism needed by Kprobes.

For break instruction on Branch type slot, imm21 is ignored and value
zero is placed in IIM register, hence we need to handle kprobes
for switch case zero.

Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Rusty Lynch <Rusty.lynch@intel.com>

From: Rusty Lynch <rusty.lynch@intel.com>

At the point in traps.c where we recieve a break with a zero value, we can
not say if the break was a result of a kprobe or some other debug facility.

This simple patch changes the informational string to a more correct "break
0" value, and applies to the 2.6.12-rc2-mm2 tree with all the kprobes
patches that were just recently included for the next mm cut.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:22 -07:00
Jesper Juhl dcd497f99a [PATCH] streamline preempt_count type across archs
The preempt_count member of struct thread_info is currently either defined
as int, unsigned int or __s32 depending on arch.  This patch makes the type
of preempt_count an int on all archs.

Having preempt_count be an unsigned type prevents the catching of
preempt_count < 0 bugs, and using int on some archs and __s32 on others is
not exactely "neat" - much nicer when it's just int all over.

A previous version of this patch was already ACK'ed by Robert Love, and the
only change in this version of the patch compared to the one he ACK'ed is
that this one also makes sure the preempt_count member is consistently
commented.

Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:19 -07:00
Christoph Lameter b5d23e5b8c [PATCH] ia64: Selectable Timer Interrupt Frequency
It allows a selectable timer interrupt frequency of 100, 250 and 1000 HZ.
Reducing the timer frequency may have important performance benefits on
large systems.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:10 -07:00
Bob Picco 400e65146c [PATCH] ia64: pfn_to_nid() implementation
pfn_to_nid is undefined.  We haven't had this interface on ia64.  The
sys_mbind patches need it.

Oh, the paddr_to_nid call could fail when DISCONTIG+NUMA is configured
because there isn't any ACPI SRAT NUMA information.

Signed-off-by: Bob Picco <bob.picco@hp.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 18:46:19 -07:00
Jes Sorensen 65ed0b337b [PATCH] SN2 XPC build patches
This patch contains the bits to make the XPC code use the uncached
allocator rather than calling into the mspec driver.  It also includes the
mspec.h header which is required to build the XPC modules.

Signed-off-by: Jes Sorensen <jes@wildopensource.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 18:46:18 -07:00
Jes Sorensen f14f75b811 [PATCH] ia64 uncached alloc
This patch contains the ia64 uncached page allocator and the generic
allocator (genalloc).  The uncached allocator was formerly part of the SN2
mspec driver but there are several other users of it so it has been split
off from the driver.

The generic allocator can be used by device driver to manage special memory
etc.  The generic allocator is based on the allocator from the sym53c8xx_2
driver.

Various users on ia64 needs uncached memory.  The SGI SN architecture requires
it for inter-partition communication between partitions within a large NUMA
cluster.  The specific user for this is the XPC code.  Another application is
large MPI style applications which use it for synchronization, on SN this can
be done using special 'fetchop' operations but it also benefits non SN
hardware which may use regular uncached memory for this purpose.  Performance
of doing this through uncached vs cached memory is pretty substantial.  This
is handled by the mspec driver which I will push out in a seperate patch.

Rather than creating a specific allocator for just uncached memory I came up
with genalloc which is a generic purpose allocator that can be used by device
drivers and other subsystems as they please.  For instance to handle onboard
device memory.  It was derived from the sym53c7xx_2 driver's allocator which
is also an example of a potential user (I am refraining from modifying sym2
right now as it seems to have been under fairly heavy development recently).

On ia64 memory has various properties within a granule, ie.  it isn't safe to
access memory as uncached within the same granule as currently has memory
accessed in cached mode.  The regular system therefore doesn't utilize memory
in the lower granules which is mixed in with device PAL code etc.  The
uncached driver walks the EFI memmap and pulls out the spill uncached pages
and sticks them into the uncached pool.  Only after these chunks have been
utilized, will it start converting regular cached memory into uncached memory.
Hence the reason for the EFI related code additions.

Signed-off-by: Jes Sorensen <jes@wildopensource.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 18:46:18 -07:00
David Gibson 63551ae0fe [PATCH] Hugepage consolidation
A lot of the code in arch/*/mm/hugetlbpage.c is quite similar.  This patch
attempts to consolidate a lot of the code across the arch's, putting the
combined version in mm/hugetlb.c.  There are a couple of uglyish hacks in
order to covert all the hugepage archs, but the result is a very large
reduction in the total amount of code.  It also means things like hugepage
lazy allocation could be implemented in one place, instead of six.

Tested, at least a little, on ppc64, i386 and x86_64.

Notes:
	- this patch changes the meaning of set_huge_pte() to be more
	  analagous to set_pte()
	- does SH4 need s special huge_ptep_get_and_clear()??

Acked-by: William Lee Irwin <wli@holomorphy.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 18:46:15 -07:00
Martin Hicks 753ee72896 [PATCH] VM: early zone reclaim
This is the core of the (much simplified) early reclaim.  The goal of this
patch is to reclaim some easily-freed pages from a zone before falling back
onto another zone.

One of the major uses of this is NUMA machines.  With the default allocator
behavior the allocator would look for memory in another zone, which might be
off-node, before trying to reclaim from the current zone.

This adds a zone tuneable to enable early zone reclaim.  It is selected on a
per-zone basis and is turned on/off via syscall.

Adding some extra throttling on the reclaim was also required (patch
4/4).  Without the machine would grind to a crawl when doing a "make -j"
kernel build.  Even with this patch the System Time is higher on
average, but it seems tolerable.  Here are some numbers for kernbench
runs on a 2-node, 4cpu, 8Gig RAM Altix in the "make -j" run:

			wall  user   sys   %cpu  ctx sw.  sleeps
			----  ----   ---   ----   ------  ------
No patch		1009  1384   847   258   298170   504402
w/patch, no reclaim     880   1376   667   288   254064   396745
w/patch & reclaim       1079  1385   926   252   291625   548873

These numbers are the average of 2 runs of 3 "make -j" runs done right
after system boot.  Run-to-run variability for "make -j" is huge, so
these numbers aren't terribly useful except to seee that with reclaim
the benchmark still finishes in a reasonable amount of time.

I also looked at the NUMA hit/miss stats for the "make -j" runs and the
reclaim doesn't make any difference when the machine is thrashing away.

Doing a "make -j8" on a single node that is filled with page cache pages
takes 700 seconds with reclaim turned on and 735 seconds without reclaim
(due to remote memory accesses).

The simple zone_reclaim syscall program is at
http://www.bork.org/~mort/sgi/zone_reclaim.c

Signed-off-by: Martin Hicks <mort@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 18:46:14 -07:00
Ingo Molnar 39c715b717 [PATCH] smp_processor_id() cleanup
This patch implements a number of smp_processor_id() cleanup ideas that
Arjan van de Ven and I came up with.

The previous __smp_processor_id/_smp_processor_id/smp_processor_id API
spaghetti was hard to follow both on the implementational and on the
usage side.

Some of the complexity arose from picking wrong names, some of the
complexity comes from the fact that not all architectures defined
__smp_processor_id.

In the new code, there are two externally visible symbols:

 - smp_processor_id(): debug variant.

 - raw_smp_processor_id(): nondebug variant. Replaces all existing
   uses of _smp_processor_id() and __smp_processor_id(). Defined
   by every SMP architecture in include/asm-*/smp.h.

There is one new internal symbol, dependent on DEBUG_PREEMPT:

 - debug_smp_processor_id(): internal debug variant, mapped to
                             smp_processor_id().

Also, i moved debug_smp_processor_id() from lib/kernel_lock.c into a new
lib/smp_processor_id.c file.  All related comments got updated and/or
clarified.

I have build/boot tested the following 8 .config combinations on x86:

 {SMP,UP} x {PREEMPT,!PREEMPT} x {DEBUG_PREEMPT,!DEBUG_PREEMPT}

I have also build/boot tested x64 on UP/PREEMPT/DEBUG_PREEMPT.  (Other
architectures are untested, but should work just fine.)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 18:46:13 -07:00
Peter Chubb 05062d96a2 [PATCH] ia64: fix floating-point preemption problem
There've been reports of problems with CONFIG_PREEMPT=y and the high
floating point partition.  This is caused by the possibility of preemption
and rescheduling on a different processor while saving or restioirng the
high partition.

The only places where the FPU state is touched are in ptrace, in
switch_to(), and where handling a floating-point exception.  In switch_to()
preemption is off.  So it's only in trap.c and ptrace.c that we need to
prevent preemption.

Here is a patch that adds commentary to make the conditions clear, and adds
appropriate preempt_{en,dis}able() calls to make it so.  In trap.c I use
preempt_enable_no_resched(), as we're about to return to user space where
the preemption flag will be checked anyway.

Signed-off-by: Peter Chubb <peterc@gelato.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-08 16:21:14 -07:00