Commit Graph

412 Commits (c46261de0d98372112d8edf16f74ce418a268d46)

Author SHA1 Message Date
Peter Zijlstra f20786ff4d lockstat: core infrastructure
Introduce the core lock statistics code.

Lock statistics provides lock wait-time and hold-time (as well as the count
of corresponding contention and acquisitions events). Also, the first few
call-sites that encounter contention are tracked.

Lock wait-time is the time spent waiting on the lock. This provides insight
into the locking scheme, that is, a heavily contended lock is indicative of
a too coarse locking scheme.

Lock hold-time is the duration the lock was held, this provides a reference for
the wait-time numbers, so they can be put into perspective.

  1)
    lock
  2)
    ... do stuff ..
    unlock
  3)

The time between 1 and 2 is the wait-time. The time between 2 and 3 is the
hold-time.

The lockdep held-lock tracking code is reused, because it already collects locks
into meaningful groups (classes), and because it is an existing infrastructure
for lock instrumentation.

Currently lockdep tracks lock acquisition with two hooks:

  lock()
    lock_acquire()
    _lock()

 ... code protected by lock ...

  unlock()
    lock_release()
    _unlock()

We need to extend this with two more hooks, in order to measure contention.

  lock_contended() - used to measure contention events
  lock_acquired()  - completion of the contention

These are then placed the following way:

  lock()
    lock_acquire()
    if (!_try_lock())
      lock_contended()
      _lock()
      lock_acquired()

 ... do locked stuff ...

  unlock()
    lock_release()
    _unlock()

(Note: the try_lock() 'trick' is used to avoid instrumenting all platform
       dependent lock primitive implementations.)

It is also possible to toggle the two lockdep features at runtime using:

  /proc/sys/kernel/prove_locking
  /proc/sys/kernel/lock_stat

(esp. turning off the O(n^2) prove_locking functionaliy can help)

[akpm@linux-foundation.org: build fixes]
[akpm@linux-foundation.org: nuke unneeded ifdefs]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:49 -07:00
Kay Sievers 60a96a5956 Driver core: accept all valid action-strings in uevent-trigger
This allows the uevent file to handle any type of uevent action to be
triggered by userspace instead of just the "add" uevent.


Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-18 15:49:49 -07:00
Jeremy Fitzhardinge 86313c488a usermodehelper: Tidy up waiting
Rather than using a tri-state integer for the wait flag in
call_usermodehelper_exec, define a proper enum, and use that.  I've
preserved the integer values so that any callers I've missed should
still work OK.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Andi Kleen <ak@suse.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Joel Becker <joel.becker@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Srivatsa Vaddagiri <vatsa@in.ibm.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: David Howells <dhowells@redhat.com>
2007-07-18 08:47:40 -07:00
Jeremy Fitzhardinge d84d1cc764 add argv_split()
argv_split() is a helper function which takes a string, splits it at
whitespace, and returns a NULL-terminated argv vector.  This is
deliberately simple - it does no quote processing of any kind.

[ Seems to me that this is something which is already being done in
  the kernel, but I couldn't find any other implementations, either to
  steal or replace.  Keep an eye out. ]

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
2007-07-18 08:47:40 -07:00
Jan Nikitenko ad241528c4 CRC7 support
Add CRC7 routines, used for example in MMC over SPI communication.
Kerneldoc updates

[akpm@linux-foundation.org: fix funny mix of const and non-const]
Signed-off-by: Jan Nikitenko <jan.nikitenko@gmail.com>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: "Randy.Dunlap" <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-17 10:23:04 -07:00
Christoph Lameter 94f6030ca7 Slab allocators: Replace explicit zeroing with __GFP_ZERO
kmalloc_node() and kmem_cache_alloc_node() were not available in a zeroing
variant in the past.  But with __GFP_ZERO it is possible now to do zeroing
while allocating.

Use __GFP_ZERO to remove the explicit clearing of memory via memset whereever
we can.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-17 10:23:02 -07:00
Linus Torvalds a54890d7a6 Make check_signature depend on CONFIG_HAS_IOMEM
This should avoid build problems on architectures without a "readb()",
that got bitten by check_signature() being uninlined.

Noted by Heiko Carstens.

Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 16:50:01 -07:00
Denis Vlasenko 4277eedd79 vsprintf.c: optimizing, part 2: base 10 conversion speedup, v2
Optimize integer-to-string conversion in vsprintf.c for base 10.  This is
by far the most used conversion, and in some use cases it impacts
performance.  For example, top reads /proc/$PID/stat for every process, and
with 4000 processes decimal conversion alone takes noticeable time.

Using code from

http://www.cs.uiowa.edu/~jones/bcd/decimal.html
(with permission from the author, Douglas W. Jones)

binary-to-decimal-string conversion is done in groups of five digits at
once, using only additions/subtractions/shifts (with -O2; -Os throws in
some multiply instructions).

On i386 arch gcc 4.1.2 -O2 generates ~500 bytes of code.

This patch is run tested. Userspace benchmark/test is also attached.
I tested it on PIII and AMD64 and new code is generally ~2.5 times
faster. On AMD64:

# ./vsprintf_verify-O2
Original decimal conv: .......... 151 ns per iteration
Patched decimal conv:  .......... 62 ns per iteration
Testing correctness
12895992590592 ok...        [Ctrl-C]
# ./vsprintf_verify-O2
Original decimal conv: .......... 151 ns per iteration
Patched decimal conv:  .......... 62 ns per iteration
Testing correctness
26025406464 ok...        [Ctrl-C]

More realistic test: top from busybox project was modified to
report how many us it took to scan /proc (this does not account
any processing done after that, like sorting process list),
and then I test it with 4000 processes:

#!/bin/sh
i=4000
while test $i != 0; do
    sleep 30 &
    let i--
done
busybox top -b -n3 >/dev/null

on unpatched kernel:

top: 4120 processes took 102864 microseconds to scan
top: 4120 processes took 91757 microseconds to scan
top: 4120 processes took 92517 microseconds to scan
top: 4120 processes took 92581 microseconds to scan

on patched kernel:

top: 4120 processes took 75460 microseconds to scan
top: 4120 processes took 66451 microseconds to scan
top: 4120 processes took 67267 microseconds to scan
top: 4120 processes took 67618 microseconds to scan

The speedup comes from much faster generation of /proc/PID/stat
by sprintf() calls inside the kernel.

Signed-off-by: Douglas W Jones <jones@cs.uiowa.edu>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:52 -07:00
Denis Vlasenko b39a734097 vsprintf.c: optimizing, part 1 (easy and obvious stuff)
* There is no point in having full "0...9a...z" constant vector,
  if we use only "0...9a...f" (and "x" for "0x").

* Post-decrement usually needs a few more instructions, so use
  pre decrement instead where makes sense:
-       while (i < precision--) {
+       while (i <= --precision) {

* if base != 10 (=> base 8 or 16), we can avoid using division
  in a loop and use mask/shift, obtaining much faster conversion.
  (More complex optimization for base 10 case is in the second patch).

Overall, size vsprintf.o shows ~80 bytes smaller text section
with this patch applied.

Signed-off-by: Douglas W Jones <jones@cs.uiowa.edu>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:52 -07:00
Heiko Carstens 608e261968 generic bug: use show_regs() instead of dump_stack()
The current generic bug implementation has a call to dump_stack() in case a
WARN_ON(whatever) gets hit.  Since report_bug(), which calls dump_stack(),
gets called from an exception handler we can do better: just pass the
pt_regs structure to report_bug() and pass it to show_regs() in case of a
warning.  This will give more debug informations like register contents,
etc...  In addition this avoids some pointless lines that dump_stack()
emits, since it includes a stack backtrace of the exception handler which
is of no interest in case of a warning.  E.g.  on s390 the following lines
are currently always present in a stack backtrace if dump_stack() gets
called from report_bug():

 [<000000000001517a>] show_trace+0x92/0xe8)
 [<0000000000015270>] show_stack+0xa0/0xd0
 [<00000000000152ce>] dump_stack+0x2e/0x3c
 [<0000000000195450>] report_bug+0x98/0xf8
 [<0000000000016cc8>] illegal_op+0x1fc/0x21c
 [<00000000000227d6>] sysc_return+0x0/0x10

Acked-by: Jeremy Fitzhardinge <jeremy@goop.org>
Acked-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Kyle McMartin <kyle@parisc-linux.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:51 -07:00
Andrew Morton cc2ea416b2 uninline check_signature()
This is a rather bizarre thing to have inlined in io.h.  Stick it in lib/
instead.

While we're there, despaghetti it a bit, and fix its off-by-one behaviour when
passed a zero length.

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:50 -07:00
Andrew Morton b4ef0296f2 percpu_counters: use for_each_online_cpu()
Now that we have implemented hotunplug-time counter spilling,
percpu_counter_sum() only needs to look at online CPUs.

Cc: Gautham R Shenoy <ego@in.ibm.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:41 -07:00
Andrew Morton c67ad917cb percpu_counters(): use cpu notifiers
per-cpu counters presently must iterate over all possible CPUs in the
exhaustive percpu_counter_sum().

But it can be much better to only iterate over the presently-online CPUs.  To
do this, we must arrange for an offlined CPU's count to be spilled into the
counter's central count.

We can do this for all percpu_counters in the machine by linking them into a
single global list and walking that list at CPU_DEAD time.

(I hope.  Might have race windows in which the percpu_counter_sum() count is
inaccurate?)

Cc: Gautham R Shenoy <ego@in.ibm.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:41 -07:00
Christoph Lameter f0630fff54 SLUB: support slub_debug on by default
Add a new configuration variable

CONFIG_SLUB_DEBUG_ON

If set then the kernel will be booted by default with slab debugging
switched on. Similar to CONFIG_SLAB_DEBUG. By default slab debugging
is available but must be enabled by specifying "slub_debug" as a
kernel parameter.

Also add support to switch off slab debugging for a kernel that was
built with CONFIG_SLUB_DEBUG_ON. This works by specifying

slub_debug=-

as a kernel parameter.

Dave Jones wanted this feature.
http://marc.info/?l=linux-kernel&m=118072189913045&w=2

[akpm@linux-foundation.org: clean up switch statement]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:36 -07:00
Kristian Hoegsberg 23936cc0b5 lib: add idr_remove_all
Remove all ids from the given idr tree.  idr_destroy() only frees up
unused, cached idp_layers, but this function will remove all id mappings
and leave all idp_layers unused.

A typical clean-up sequence for objects stored in an idr tree, will use
idr_for_each() to free all objects, if necessay, then idr_remove_all() to
remove all ids, and idr_destroy() to free up the cached idr_layers.

Signed-off-by: Kristian Hoegsberg <krh@redhat.com>
Cc: Tejun Heo <htejun@gmail.com>
Cc: Dave Airlie <airlied@linux.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:34 -07:00
Kristian Hoegsberg 96d7fa421e lib: add idr_for_each()
This patch adds an iterator function for the idr data structure.  Compared
to just iterating through the idr with an integer and idr_find, this
iterator is (almost, but not quite) linear in the number of elements, as
opposed to the number of integers in the range covered by the idr.  This
makes a difference for sparse idrs, but more importantly, it's a nicer way
to iterate through the elements.

The drm subsystem is moving to idr for tracking contexts and drawables, and
with this change, we can use the idr exclusively for tracking these
resources.

[akpm@linux-foundation.org: fix comment]
Signed-off-by: Kristian Hoegsberg <krh@redhat.com>
Cc: Tejun Heo <htejun@gmail.com>
Cc: Dave Airlie <airlied@linux.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:34 -07:00
Linus Torvalds 2d896c780d Merge branch 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6
* 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6: (37 commits)
  [XFS] Fix lockdep annotations for xfs_lock_inodes
  [LIB]: export radix_tree_preload()
  [XFS] Fix XFS_IOC_FSBULKSTAT{,_SINGLE} & XFS_IOC_FSINUMBERS in compat mode
  [XFS] Compat ioctl handler for handle operations
  [XFS] Compat ioctl handler for XFS_IOC_FSGEOMETRY_V1.
  [XFS] Clean up function name handling in tracing code
  [XFS] Quota inode has no parent.
  [XFS] Concurrent Multi-File Data Streams
  [XFS] Use uninitialized_var macro to stop warning about rtx
  [XFS] XFS should not be looking at filp reference counts
  [XFS] Use is_power_of_2 instead of open coding checks
  [XFS] Reduce shouting by removing unnecessary macros from dir2 code.
  [XFS] Simplify XFS min/max macros.
  [XFS] Kill off xfs_count_bits
  [XFS] Cancel transactions on xfs_itruncate_start error.
  [XFS] Use do_div() on 64 bit types.
  [XFS] Fix remount,readonly path to flush everything correctly.
  [XFS] Cleanup inode extent size hint extraction
  [XFS] Prevent ENOSPC from aborting transactions that need to succeed
  [XFS] Prevent deadlock when flushing inodes on unmount
  ...
2007-07-15 16:43:43 -07:00
David Chinner d7f0923d83 [LIB]: export radix_tree_preload()
XFS filestreams functionality uses radix trees and the preload
functions. XFS can be built as a module and hence we need
radix_tree_preload() exported. radix_tree_preload_end() is a
static inline, so it doesn't need exporting.

Signed-Off-By: Dave Chinner <dgc@sgi.com>
Signed-Off-By: Tim Shimmin <tes@sgi.com>
2007-07-14 16:05:04 +10:00
Tejun Heo 608e266a2d sysfs: make kobj point to sysfs_dirent instead of dentry
As kobj sysfs dentries and inodes are gonna be made reclaimable,
dentry can't be used as naming token for sysfs file/directory, replace
kobj->dentry with kobj->sd.  The only external interface change is
shadow directory handling.  All other changes are contained in kobj
and sysfs.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:08 -07:00
Tejun Heo 72dba584b6 ida: implement idr based id allocator
Implement idr based id allocator.  ida is used the same way idr is
used but lacks id -> ptr translation and thus consumes much less
memory.  struct ida_bitmap is attached as leaf nodes to idr tree which
is managed by the idr code.  Each ida_bitmap is 128bytes long and
contains slightly less than a thousand slots.

ida is more aggressive with releasing extra resources acquired using
ida_pre_get().  After every successful id allocation, ida frees one
reserved idr_layer if possible.  Reserved ida_bitmap is not freed
automatically but only one ida_bitmap is reserved and it's almost
always used right away.  Under most circumstances, ida won't hold on
to memory for too long which isn't actively used.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:03 -07:00
Tejun Heo e33ac8bdb0 idr: separate out idr_mark_full()
Separate out idr_mark_full() from sub_alloc() and make marking the
allocated slot full the responsibility of idr_get_new_above_int().

Allocation part of idr_get_new_above_int() is renamed to
idr_get_empty_slot().  New idr_get_new_above_int() allocates a slot
using the function, install the user pointer and marks it full using
idr_mark_full().

This change doesn't introduce any behavior change.  This will be
used by ida.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:03 -07:00
Tejun Heo 7aae6dd80e idr: fix obscure bug in allocation path
In sub_alloc(), when bitmap search fails, it goes up one level to
continue search.  This is done by updating the id cursor and searching
the upper level again.  If the cursor was at the end of the upper
level, we need to go further than that.

This wasn't implemented and when that happens the part of the cursor
which indexes into the upper level wraps and sub_alloc() ends up
searching the wrong bitmap.  It allocates id which doesn't match the
actual slot.

This patch fixes this by restarting from the top if the search needs
to go higher than one level.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:02 -07:00
Kay Sievers 80f03e349f Driver core: add missing kset uevent
We get uevents for a bus/class going away, but not one registering.
Add the missing uevent in kset_register(), which will send an
event for a new bus/class. Suppress all unwanted uevents for bus
subdirectories like /bus/*/devices/, /bus/*/drivers/.

Now we get for module usbcore:
  add      /module/usbcore (module)
  add      /bus/usb (bus)
  add      /class/usb_host (class)
  add      /bus/usb/drivers/hub (drivers)
  add      /bus/usb/drivers/usb (drivers)
  remove   /bus/usb/drivers/usb (drivers)
  remove   /bus/usb/drivers/hub (drivers)
  remove   /class/usb_host (class)
  remove   /bus/usb (bus)
  remove   /module/usbcore (module)

instead of:
  add      /module/usbcore (module)
  add      /bus/usb/drivers/hub (drivers)
  add      /bus/usb/drivers/usb (drivers)
  remove   /bus/usb/drivers/usb (drivers)
  remove   /bus/usb/drivers/hub (drivers)
  remove   /class/usb_host (class)
  remove   /bus/usb/drivers (bus)
  remove   /bus/usb/devices (bus)
  remove   /bus/usb (bus)
  remove   /module/usbcore (module)

Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:01 -07:00
Richard Purdie 64c70b1cf4 Add LZO1X algorithm to the kernel
This is a hybrid version of the patch to add the LZO1X compression
algorithm to the kernel.  Nitin and myself have merged the best parts of
the various patches to form this version which we're both happy with (and
are jointly signing off).

The performance of this version is equivalent to the original minilzo code
it was based on.  Bytecode comparisons have also been made on ARM, i386 and
x86_64 with favourable results.

There are several users of LZO lined up including jffs2, crypto and reiser4
since its much faster than zlib.

Signed-off-by: Nitin Gupta <nitingupta910@gmail.com>
Signed-off-by: Richard Purdie <rpurdie@openedhand.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-10 17:51:13 -07:00
Ingo Molnar b642b6d3fa sched: scheduler debugging, enable in Kconfig
enable CONFIG_SCHED_DEBUG in lib/Kconfig.debug.

the runtime overhead of this option is very small.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-07-09 18:52:00 +02:00
Linus Torvalds 837525e344 Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6:
  firmware: remove orphaned Email
  kobject: use the proper printk level for kobject error
  Driver core: kill unused code
  Driver core: keep PHYSDEV for old struct class_device
  update Documentation/driver-model/platform.txt
2007-06-08 18:11:47 -07:00
Randy Dunlap c790923499 hexdump: more output formatting
Add a prefix string parameter.  Callers are responsible for any string
length/alignment that they want to see in the output.  I.e., callers should
pad strings to achieve alignment if they want that.

Add rowsize parameter.  This is the number of raw data bytes to be printed
per line.  Must be 16 or 32.

Add a groupsize parameter.  This allows callers to dump values as 1-byte,
2-byte, 4-byte, or 8-byte numbers.  Default is 1-byte numbers.  If the
total length is not an even multiple of groupsize, 1-byte numbers are
printed.

Add an "ascii" output parameter.  This causes ASCII data output following
the hex data output.

Clean up some doc examples.

Align the ASCII output on all lines that are produced by one call.

Add a new interface, print_hex_dump_bytes(), that is a shortcut to
print_hex_dump(), using default parameter values to print 16 bytes in
byte-size chunks of hex + ASCII output, using printk level KERN_DEBUG.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-06-08 17:23:34 -07:00
Greg Kroah-Hartman 5c73a3fba6 kobject: use the proper printk level for kobject error
Thanks to Jean Delvare <khali@linux-fr.org> for pointing it out to me.

Cc: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-06-08 12:41:08 -07:00
Ingo Molnar c1a834dc70 timer stats: speedups
Make timer-stats have almost zero overhead when enabled in the config but
not used.  (this way distros can enable it more easily)

Also update the documentation about overhead of timer_stats - it was
written for the first version which had a global lock and a linear list
walk based lookup ;-)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-06-01 08:18:30 -07:00
Paul E. McKenney 9aaffc898f prohibit rcutorture from being compiled into the kernel
There have been a number of instances where people have accidentally compiled
rcutorture into the kernel (CONFIG_RCU_TORTURE_TEST=y), which has never been
useful, and has often resulted in great frustration.

The attached patch prohibits rcutorture from being compiled into the
kernel.  It may be excluded altogether or compiled as a module.  People
wishing to have rcutorture hammer their machine immediately upon boot
are free to hand-edit lib/Kconfig.debug to remove the "depends on m"
line.

Thanks to Randy Dunlap for the trick that makes this work.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Josh Triplett <josh@kernel.org>
Cc: "Randy.Dunlap" <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-23 20:14:12 -07:00
Alexey Dobriyan e8edc6e03a Detach sched.h from mm.h
First thing mm.h does is including sched.h solely for can_do_mlock() inline
function which has "current" dereference inside. By dealing with can_do_mlock()
mm.h can be detached from sched.h which is good. See below, why.

This patch
a) removes unconditional inclusion of sched.h from mm.h
b) makes can_do_mlock() normal function in mm/mlock.c
c) exports can_do_mlock() to not break compilation
d) adds sched.h inclusions back to files that were getting it indirectly.
e) adds less bloated headers to some files (asm/signal.h, jiffies.h) that were
   getting them indirectly

Net result is:
a) mm.h users would get less code to open, read, preprocess, parse, ... if
   they don't need sched.h
b) sched.h stops being dependency for significant number of files:
   on x86_64 allmodconfig touching sched.h results in recompile of 4083 files,
   after patch it's only 3744 (-8.3%).

Cross-compile tested on

	all arm defconfigs, all mips defconfigs, all powerpc defconfigs,
	alpha alpha-up
	arm
	i386 i386-up i386-defconfig i386-allnoconfig
	ia64 ia64-up
	m68k
	mips
	parisc parisc-up
	powerpc powerpc-up
	s390 s390-up
	sparc sparc-up
	sparc64 sparc64-up
	um-x86_64
	x86_64 x86_64-up x86_64-defconfig x86_64-allnoconfig

as well as my two usual configs.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-21 09:18:19 -07:00
Akinobu Mita 6d690dcac9 fault injection: disable stacktrace filter for x86-64
Disable stacktrace filter support for x86-64 for now.  Will be enable when we
can get the dwarf2 unwinder back.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-12 10:55:39 -07:00
Linus Torvalds 853da00220 Merge branch 'audit.b38' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current
* 'audit.b38' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current:
  [PATCH] Abnormal End of Processes
  [PATCH] match audit name data
  [PATCH] complete message queue auditing
  [PATCH] audit inode for all xattr syscalls
  [PATCH] initialize name osid
  [PATCH] audit signal recipients
  [PATCH] add SIGNAL syscall class (v3)
  [PATCH] auditing ptrace
2007-05-11 09:57:16 -07:00
Randy Dunlap 99eaf3c45f lib/hexdump
Based on ace_dump_mem() from Grant Likely for the Xilinx SystemACE
CompactFlash interface.

Add print_hex_dump() & hex_dumper() to lib/hexdump.c and linux/kernel.h.

This patch adds the functions print_hex_dump() & hex_dumper().
print_hex_dump() can be used to perform a hex + ASCII dump of data to
syslog, in an easily viewable format, thus providing a common text hex dump
format.

hex_dumper() provides a dump-to-memory function.  It converts one "line" of
output (16 bytes of input) at a time.

Example usages:
	print_hex_dump(KERN_DEBUG, DUMP_PREFIX_ADDRESS, frame->data, frame->len);
	hex_dumper(frame->data, frame->len, linebuf, sizeof(linebuf));

Example output using %DUMP_PREFIX_OFFSET:
0009ab42: 40414243 44454647 48494a4b 4c4d4e4f-@ABCDEFG HIJKLMNO
Example output using %DUMP_PREFIX_ADDRESS:
ffffffff88089af0: 70717273 74757677 78797a7b 7c7d7e7f-pqrstuvw xyz{|}~.

[akpm@linux-foundation.org: cleanups, add export]
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-11 08:29:34 -07:00
Amy Griffis e54dc2431d [PATCH] audit signal recipients
When auditing syscalls that send signals, log the pid and security
context for each target process. Optimize the data collection by
adding a counter for signal-related rules, and avoiding allocating an
aux struct unless we have more than one target process. For process
groups, collect pid/context data in blocks of 16. Move the
audit_signal_info() hook up in check_kill_permission() so we audit
attempts where permission is denied.

Signed-off-by: Amy Griffis <amy.griffis@hp.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2007-05-11 05:38:25 -04:00
Amy Griffis 7f13da40e3 [PATCH] add SIGNAL syscall class (v3)
Add a syscall class for sending signals.

Signed-off-by: Amy Griffis <amy.griffis@hp.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2007-05-11 05:38:25 -04:00
Linus Torvalds 9b6a51746f Merge branch 'juju' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6
* 'juju' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6: (138 commits)
  firewire: Convert OHCI driver to use standard goto unwinding for error handling.
  firewire: Always use parens with sizeof.
  firewire: Drop single buffer request support.
  firewire: Add a comment to describe why we split the sg list.
  firewire: Return SCSI_MLQUEUE_HOST_BUSY for out of memory cases in queuecommand.
  firewire: Handle the last few DMA mapping error cases.
  firewire: Allocate scsi_host up front and allocate the sbp2_device as hostdata.
  firewire: Provide module aliase for backwards compatibility.
  firewire: Add to fw-core-y instead of assigning fw-core-objs in Makefile.
  firewire: Break out shared IEEE1394 constant to separate header file.
  firewire: Use linux/*.h instead of asm/*.h header files.
  firewire: Uppercase most macro names.
  firewire: Coding style cleanup: no spaces after function names.
  firewire: Convert card_rwsem to a regular mutex.
  firewire: Clean up comment style.
  firewire: Use lib/ implementation of CRC ITU-T.
  CRC ITU-T V.41
  firewire: Rename fw-device-cdev.c to fw-cdev.c and move header to include/linux.
  firewire: Future proof the iso ioctls by adding a handle for the iso context.
  firewire: Add read/write and size annotations to IOC numbers.
  ...

Acked-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-10 13:30:08 -07:00
Ivo van Doorn 3e7cbae7c6 CRC ITU-T V.41
This will add the CRC calculation according
to the CRC ITU-T V.41 to the kernel lib/ folder.

This code has been derived from the rt2x00 driver,
currently found only in the wireless-dev tree, but
this library is generic and could be used by more
drivers who currently use their own implementation.

Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com>

Also useful for the new firewire stack.

Signed-off-by: Kristian Hoegsberg <krh@redhat.com>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2007-05-10 18:24:13 +02:00
Linus Torvalds ba7cc09c9c Merge git://git.infradead.org/mtd-2.6
* git://git.infradead.org/mtd-2.6: (21 commits)
  [MTD] [CHIPS] Remove MTD_OBSOLETE_CHIPS (jedec, amd_flash, sharp)
  [MTD] Delete allegedly obsolete "bank_size" field of mtd_info.
  [MTD] Remove unnecessary user space check from mtd.h.
  [MTD] [MAPS] Remove flash maps for no longer supported 405LP boards
  [MTD] [MAPS] Fix missing printk() parameter in physmap_of.c MTD driver
  [MTD] [NAND] platform NAND driver: add driver
  [MTD] [NAND] platform NAND driver: update header
  [JFFS2] Simplify and clean up jffs2_add_tn_to_tree() some more.
  [JFFS2] Remove another bogus optimisation in jffs2_add_tn_to_tree()
  [JFFS2] Remove broken insert_point optimisation in jffs2_add_tn_to_tree()
  [JFFS2] Remember to calculate overlap on nodes which replace older nodes
  [JFFS2] Don't advance c->wbuf_ofs to next eraseblock after wbuf flush
  [MTD] [NAND] at91_nand.c: CMDLINE_PARTS support
  [MTD] [NAND] Tidy up handling of page number in nand_block_bad()
  [MTD] block2mtd_paramline[] mustn't be __initdata
  [MTD] [NAND] Support multiple chips in CAFÉ driver
  [MTD] [NAND] Rename cafe.c to cafe_nand.c and remove the multi-obj magic
  [MTD] [NAND] Use rslib for CAFÉ ECC
  [RSLIB] Support non-canonical GF representations
  [JFFS2] Remove dead file histo_mips.h
  ...
2007-05-09 13:10:11 -07:00
Rafael J. Wysocki 8bb7844286 Add suspend-related notifications for CPU hotplug
Since nonboot CPUs are now disabled after tasks and devices have been
frozen and the CPU hotplug infrastructure is used for this purpose, we need
special CPU hotplug notifications that will help the CPU-hotplug-aware
subsystems distinguish normal CPU hotplug events from CPU hotplug events
related to a system-wide suspend or resume operation in progress.  This
patch introduces such notifications and causes them to be used during
suspend and resume transitions.  It also changes all of the
CPU-hotplug-aware subsystems to take these notifications into consideration
(for now they are handled in the same way as the corresponding "normal"
ones).

[oleg@tv-sign.ru: cleanups]
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Gautham R Shenoy <ego@in.ibm.com>
Cc: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-09 12:30:56 -07:00
Alistair John Strachan 794543a236 Move LOG_BUF_SHIFT to a more sensible place
Several people have observed that perhaps LOG_BUF_SHIFT should be in a more
obvious place than under DEBUG_KERNEL. Under some circumstances (such as the
PARISC architecture), DEBUG_KERNEL can increase kernel size, which is an
undesirable trade off for something as trivial as increasing the kernel log
buffer size.

Instead, move LOG_BUF_SHIFT into "General Setup", so that people are more
likely to be able to change it such a circumstance that the default buffer
size is insufficient.

Signed-off-by: Alistair John Strachan <s0348365@sms.ed.ac.uk>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:14 -07:00
Johannes Berg c6b40d16d1 fix sscanf %n match at end of input string
I was playing with some code that sometimes got a string where a %n
match should have been done where the input string ended, for example
like this:

  sscanf("abc123", "abc%d%n", &a, &n);  /* doesn't work */
  sscanf("abc123a", "abc%d%n", &a, &n); /* works */

However, the scanf function in the kernel doesn't convert the %n in that
case because it has already matched the complete input after %d and just
completely stops matching then. This patch fixes that.

[akpm@linux-foundation.org: cleanups]
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:05 -07:00
Robert P. J. Day c761c84154 kconfig: centralize the selection of semaphore debugging in lib/Kconfig.debug
Remove the Kconfig selection of semaphore debugging from the ALPHA and FRV
Kconfig files, and centralize it in lib/Kconfig.debug.

There doesn't seem to be much point in letting individual architectures
independently define the same Kconfig option when it can just as easily be
put in a single Kconfig file and made dependent on a subset of
architectures.  that way, at least the option shows up in the same relative
location in the menu each time.

Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Richard Henderson <rth@twiddle.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:00 -07:00
Sam Ravnborg b2ead6e012 fix section mismatch warning in lib/swiotlb.c
kbuild spits outs following warning on a
defconfig x86_64 build:
WARNING: swiotlb.o - Section mismatch: reference to .init.text:swiotlb_init from __ksymtab between '__ksymtab_swiotlb_init' (at offset 0xa0) and '__ksymtab_swiotlb_free_coherent'

This warning happens because the function swiotlb_init is marked __init and
EXPORT_SYMBOL().  A 'git grep swiotlb_init' showed no users in drivers/ so
remove the EXPORT_SYMBOL.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Andi Kleen <ak@suse.de>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:14:59 -07:00
Christoph Hellwig ab1b6f03a1 simplify the stacktrace code
Simplify the stacktrace code:

 - remove the unused task argument to save_stack_trace, it's always
   current
 - remove the all_contexts flag, it's alwasy 0

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Andi Kleen <ak@suse.de>
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:14:58 -07:00
Richard Purdie f0ac675806 Fix ppp_deflate issues with recent zlib_inflate changes
The last zlib_inflate update broke certain corner cases for ppp_deflate
decompression handling.  This patch fixes some logic to make things work
properly again.  Users other than ppp_deflate (the only Z_PACKET_FLUSH
user) should be unaffected.

Fixes bug 8405 (confirmed by Stefan)

Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
Cc: Stefan Wenk <stefan.wenk@gmx.at>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07 12:13:04 -07:00
Bryan Wu 1394f03221 blackfin architecture
This adds support for the Analog Devices Blackfin processor architecture, and
currently supports the BF533, BF532, BF531, BF537, BF536, BF534, and BF561
(Dual Core) devices, with a variety of development platforms including those
avaliable from Analog Devices (BF533-EZKit, BF533-STAMP, BF537-STAMP,
BF561-EZKIT), and Bluetechnix!  Tinyboards.

The Blackfin architecture was jointly developed by Intel and Analog Devices
Inc.  (ADI) as the Micro Signal Architecture (MSA) core and introduced it in
December of 2000.  Since then ADI has put this core into its Blackfin
processor family of devices.  The Blackfin core has the advantages of a clean,
orthogonal,RISC-like microprocessor instruction set.  It combines a dual-MAC
(Multiply/Accumulate), state-of-the-art signal processing engine and
single-instruction, multiple-data (SIMD) multimedia capabilities into a single
instruction-set architecture.

The Blackfin architecture, including the instruction set, is described by the
ADSP-BF53x/BF56x Blackfin Processor Programming Reference
http://blackfin.uclinux.org/gf/download/frsrelease/29/2549/Blackfin_PRM.pdf

The Blackfin processor is already supported by major releases of gcc, and
there are binary and source rpms/tarballs for many architectures at:
http://blackfin.uclinux.org/gf/project/toolchain/frs There is complete
documentation, including "getting started" guides available at:
http://docs.blackfin.uclinux.org/ which provides links to the sources and
patches you will need in order to set up a cross-compiling environment for
bfin-linux-uclibc

This patch, as well as the other patches (toolchain, distribution,
uClibc) are actively supported by Analog Devices Inc, at:
http://blackfin.uclinux.org/

We have tested this on LTP, and our test plan (including pass/fails) can
be found at:
http://docs.blackfin.uclinux.org/doku.php?id=testing_the_linux_kernel

[m.kozlowski@tuxland.pl: balance parenthesis in blackfin header files]
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
Signed-off-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl>
Signed-off-by: Aubrey Li <aubrey.li@analog.com>
Signed-off-by: Jie Zhang <jie.zhang@analog.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07 12:12:58 -07:00
Heiko Carstens 411f0f3edc Introduce CONFIG_HAS_DMA
Architectures that don't support DMA can say so by adding a config NO_DMA
to their Kconfig file.  This will prevent compilation of some dma specific
driver code.  Also dma-mapping-broken.h isn't needed anymore on at least
s390.  This avoids compilation and linking of otherwise dead/broken code.

Other architectures that include dma-mapping-broken.h are arm26, h8300,
m68k, m68knommu and v850.  If these could be converted as well we could get
rid of the header file.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
"John W. Linville" <linville@tuxdriver.com>
Cc: Kyle McMartin <kyle@parisc-linux.org>
Cc: <James.Bottomley@SteelEye.com>
Cc: Tejun Heo <htejun@gmail.com>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: <geert@linux-m68k.org>
Cc: <zippel@linux-m68k.org>
Cc: <spyro@f2s.com>
Cc: <uclinux-v850@lsi.nec.co.jp>
Cc: <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07 12:12:51 -07:00
Christoph Lameter 476f35348e Safer nr_node_ids and nr_node_ids determination and initial values
The nr_cpu_ids value is currently only calculated in smp_init.  However, it
may be needed before (SLUB needs it on kmem_cache_init!) and other kernel
components may also want to allocate dynamically sized per cpu array before
smp_init.  So move the determination of possible cpus into sched_init()
where we already loop over all possible cpus early in boot.

Also initialize both nr_node_ids and nr_cpu_ids with the highest value they
could take.  If we have accidental users before these values are determined
then the current valud of 0 may cause too small per cpu and per node arrays
to be allocated.  If it is set to the maximum possible then we only waste
some memory for early boot users.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07 12:12:51 -07:00
Linus Torvalds 15700770ef Merge git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild
* git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild: (38 commits)
  kconfig: fix mconf segmentation fault
  kbuild: enable use of code from a different dir
  kconfig: error out if recursive dependencies are found
  kbuild: scripts/basic/fixdep segfault on pathological string-o-death
  kconfig: correct minor typo in Kconfig warning message.
  kconfig: fix path to modules.txt in Kconfig help
  usr/Kconfig: fix typo
  kernel-doc: alphabetically-sorted entries in index.html of 'htmldocs'
  kbuild: be more explicit on missing .config file
  kbuild: clarify the creation of the LOCALVERSION_AUTO string.
  kbuild: propagate errors from find in scripts/gen_initramfs_list.sh
  kconfig: refer to qt3 if we cannot find qt libraries
  kbuild: handle compressed cpio initramfs-es
  kbuild: ignore section mismatch warning for references from .paravirtprobe to .init.text
  kbuild: remove stale comment in modpost.c
  kbuild/mkuboot.sh: allow spaces in CROSS_COMPILE
  kbuild: fix make mrproper for Documentation/DocBook/man
  kbuild: remove kconfig binaries during make mrproper
  kconfig/menuconfig: do not hardcode '.config'
  kbuild: override build timestamp & version
  ...
2007-05-06 13:21:57 -07:00