Commit graph

914 commits

Author SHA1 Message Date
Rusty Russell
2ca1a61583 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6
Conflicts:

	arch/x86/kernel/io_apic.c
2008-12-31 23:05:57 +10:30
Linus Torvalds
ab70537c32 Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
  lguest: struct device - replace bus_id with dev_name()
  lguest: move the initial guest page table creation code to the host
  kvm-s390: implement config_changed for virtio on s390
  virtio_console: support console resizing
  virtio: add PCI device release() function
  virtio_blk: fix type warning
  virtio: block: dynamic maximum segments
  virtio: set max_segment_size and max_sectors to infinite.
  virtio: avoid implicit use of Linux page size in balloon interface
  virtio: hand virtio ring alignment as argument to vring_new_virtqueue
  virtio: use KVM_S390_VIRTIO_RING_ALIGN instead of relying on pagesize
  virtio: use LGUEST_VRING_ALIGN instead of relying on pagesize
  virtio: Don't use PAGE_SIZE for vring alignment in virtio_pci.
  virtio: rename 'pagesize' arg to vring_init/vring_size
  virtio: Don't use PAGE_SIZE in virtio_pci.c
  virtio: struct device - replace bus_id with dev_name(), dev_set_name()
  virtio-pci queue allocation not page-aligned
2008-12-30 17:37:25 -08:00
Rusty Russell
db40598863 virtio: use KVM_S390_VIRTIO_RING_ALIGN instead of relying on pagesize
This doesn't really matter, since s390 pagesize is 4k anyway.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
2008-12-30 09:26:03 +10:30
Rusty Russell
33edcf133b Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 2008-12-30 08:02:35 +10:30
Jens Axboe
abf137dd77 aio: make the lookup_ioctx() lockless
The mm->ioctx_list is currently protected by a reader-writer lock,
so we always grab that lock on the read side for doing ioctx
lookups. As the workload is extremely reader biased, turn this into
an rcu hlist so we can make lookup_ioctx() lockless. Get rid of
the rwlock and use a spinlock for providing update side exclusion.

There's usually only 1 entry on this list, so it doesn't make sense
to look into fancier data structures.

Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-12-29 08:29:50 +01:00
Linus Torvalds
0191b625ca Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1429 commits)
  net: Allow dependancies of FDDI & Tokenring to be modular.
  igb: Fix build warning when DCA is disabled.
  net: Fix warning fallout from recent NAPI interface changes.
  gro: Fix potential use after free
  sfc: If AN is enabled, always read speed/duplex from the AN advertising bits
  sfc: When disabling the NIC, close the device rather than unregistering it
  sfc: SFT9001: Add cable diagnostics
  sfc: Add support for multiple PHY self-tests
  sfc: Merge top-level functions for self-tests
  sfc: Clean up PHY mode management in loopback self-test
  sfc: Fix unreliable link detection in some loopback modes
  sfc: Generate unique names for per-NIC workqueues
  802.3ad: use standard ethhdr instead of ad_header
  802.3ad: generalize out mac address initializer
  802.3ad: initialize ports LACPDU from const initializer
  802.3ad: remove typedef around ad_system
  802.3ad: turn ports is_individual into a bool
  802.3ad: turn ports is_enabled into a bool
  802.3ad: make ntt bool
  ixgbe: Fix set_ringparam in ixgbe to use the same memory pools.
  ...

Fixed trivial IPv4/6 address printing conflicts in fs/cifs/connect.c due
to the conversion to %pI (in this networking merge) and the addition of
doing IPv6 addresses (from the earlier merge of CIFS).
2008-12-28 12:49:40 -08:00
Linus Torvalds
1db2a5c11e Merge branch 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6
* 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6: (85 commits)
  [S390] provide documentation for hvc_iucv kernel parameter.
  [S390] convert ctcm printks to dev_xxx and pr_xxx macros.
  [S390] convert zfcp printks to pr_xxx macros.
  [S390] convert vmlogrdr printks to pr_xxx macros.
  [S390] convert zfcp dumper printks to pr_xxx macros.
  [S390] convert cpu related printks to pr_xxx macros.
  [S390] convert qeth printks to dev_xxx and pr_xxx macros.
  [S390] convert sclp printks to pr_xxx macros.
  [S390] convert iucv printks to dev_xxx and pr_xxx macros.
  [S390] convert ap_bus printks to pr_xxx macros.
  [S390] convert dcssblk and extmem printks messages to pr_xxx macros.
  [S390] convert monwriter printks to pr_xxx macros.
  [S390] convert s390 debug feature printks to pr_xxx macros.
  [S390] convert monreader printks to pr_xxx macros.
  [S390] convert appldata printks to pr_xxx macros.
  [S390] convert setup printks to pr_xxx macros.
  [S390] convert hypfs printks to pr_xxx macros.
  [S390] convert time printks to pr_xxx macros.
  [S390] convert cpacf printks to pr_xxx macros.
  [S390] convert cio printks to pr_xxx macros.
  ...
2008-12-28 12:33:21 -08:00
Linus Torvalds
a39b863342 Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (31 commits)
  sched: fix warning in fs/proc/base.c
  schedstat: consolidate per-task cpu runtime stats
  sched: use RCU variant of list traversal in for_each_leaf_rt_rq()
  sched, cpuacct: export percpu cpuacct cgroup stats
  sched, cpuacct: refactoring cpuusage_read / cpuusage_write
  sched: optimize update_curr()
  sched: fix wakeup preemption clock
  sched: add missing arch_update_cpu_topology() call
  sched: let arch_update_cpu_topology indicate if topology changed
  sched: idle_balance() does not call load_balance_newidle()
  sched: fix sd_parent_degenerate on non-numa smp machine
  sched: add uid information to sched_debug for CONFIG_USER_SCHED
  sched: move double_unlock_balance() higher
  sched: update comment for move_task_off_dead_cpu
  sched: fix inconsistency when redistribute per-cpu tg->cfs_rq shares
  sched/rt: removed unneeded defintion
  sched: add hierarchical accounting to cpu accounting controller
  sched: include group statistics in /proc/sched_debug
  sched: rename SCHED_NO_NO_OMIT_FRAME_POINTER => SCHED_OMIT_FRAME_POINTER
  sched: clean up SCHED_CPUMASK_ALLOC
  ...
2008-12-28 12:27:58 -08:00
Rusty Russell
9be3eec2c8 cpumask: cpu_coregroup_mask(): s390
Like cpu_coregroup_map, but returns a (const) pointer.

Compile-tested on s390 (defconfig).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Mike Travis <travis@sgi.com>
2008-12-26 22:23:42 +10:30
Martin Schwidefsky
395d31d40c [S390] convert cpu related printks to pr_xxx macros.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:25 +01:00
Hongjie Yang
93098bf015 [S390] convert dcssblk and extmem printks messages to pr_xxx macros.
Signed-off-by: Hongjie Yang <hongjie@us.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:23 +01:00
Michael Holzheu
c5612c1956 [S390] convert s390 debug feature printks to pr_xxx macros.
Signed-off-by: Michael Holzheu <holzheu@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:22 +01:00
Gerald Schaefer
e7534b0ec9 [S390] convert appldata printks to pr_xxx macros.
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:22 +01:00
Martin Schwidefsky
3b6ed4ab48 [S390] convert setup printks to pr_xxx macros.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:21 +01:00
Michael Holzheu
f55495ba1a [S390] convert hypfs printks to pr_xxx macros.
Signed-off-by: Michael Holzheu <holzheu@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:21 +01:00
Martin Schwidefsky
feab6501d8 [S390] convert time printks to pr_xxx macros.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:20 +01:00
Jan Glauber
39f0939249 [S390] convert cpacf printks to pr_xxx macros.
Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:20 +01:00
Christian Borntraeger
2f526e5acb [S390] convert cpcmd printks to pr_xxx macros.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:19 +01:00
Julia Lawall
acfa922c5a [S390] s390: Remove redundant test
The loop above the modified code only terminates when rc is a valid pointer.

A simplified version of the semantic patch that makes this change is as
follows: (http://www.emn.fr/x-info/coccinelle/)

// <smpl>
@r exists@
local idexpression x;
expression E;
position p1,p2;
@@

if (x@p1 == NULL || ...) { ... when forall
   return ...; }
... when != \(x=E\|x--\|x++\|--x\|++x\|x-=E\|x+=E\|x|=E\|x&=E\|&x\)
(
x@p2 == NULL
|
x@p2 != NULL
)

// another path to the test that is not through p1?
@s exists@
local idexpression r.x;
position r.p1,r.p2;
@@

... when != x@p1
(
x@p2 == NULL
|
x@p2 != NULL
)

@fix depends on !s@
position r.p1,r.p2;
expression x,E;
statement S1,S2;
@@

(
- if ((x@p2 != NULL) || ...)
  S1
|
- if ((x@p2 == NULL) && ...) S1
|
- BUG_ON(x@p2 == NULL);
)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:17 +01:00
Hendrik Brueckner
0946100f56 [S390] s390/setup: set default preferred console device "ttyS"
This patch sets the default console device for s390.

The console= kernel parameter can be still used to switch the preferred
console to some other device. In that case, console messages are also
printed on the default console device (ttyS0).

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:17 +01:00
Martin Schwidefsky
33b1d09ef3 [S390] panic_stack leak in smp_alloc_lowcore
Fix freeing of the panic_stack if the allocation of async_stack failed.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:16 +01:00
Martin Schwidefsky
4f7e90d6d6 [S390] clear_table inline assembly contraints
Tell the compile that the clear_table inline assembly writes to the
memory referenced by *s.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:15 +01:00
Martin Schwidefsky
c185b783b0 [S390] Remove config options.
On s390 we always want to run with precise cputime accounting.
Remove the config options VIRT_TIMER and VIRT_CPU_ACCOUNTING.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:15 +01:00
Heiko Carstens
349f1b671a [S390] cpu topology: remove dead code
Interrupts haven't been implemented. So remove the dead code.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:15 +01:00
Heiko Carstens
2b1a61f0a8 [S390] cpu topology: introduce kernel parameter
Introduce a topology=[on|off] kernel parameter which allows to switch
cpu topology on/off. Default will be off, since it looks like that for
some workloards this doesn't behave very well (on s390).

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:14 +01:00
Martin Schwidefsky
9fee8db222 [S390] add new machine types to setup_hwcaps.
Add the machine types for z9-bc, z10-ec and z10-bc to the elf_platform
detection in setup_hwcaps.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:14 +01:00
Heiko Carstens
c58d92b233 [S390] Remove initial kernel stack backchain initialization.
Early init code clears the backchain of the initial kernel stack frame.
This is not necessary since it is pre initialized with zeros. Plus it
was broken on 64 bit since it cleared only four of eight bytes.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:13 +01:00
Harvey Harrison
64253acbf1 [S390] s390: use the new byteorder headers
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:13 +01:00
Martin Schwidefsky
e37f50e181 [S390] Add processor type march=z10 and a processor type safety check.
This patch adds the code generation option for IBM System z10 and
adds a check in head[31,64].S to prevents the execution of a kernel
compiled for a new processor type on an old machine.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:12 +01:00
Martin Schwidefsky
2d6cd2a590 [S390] remove warnings with functions ending in BUG
Functions which end in a BUG() statement and skip the return statement
cause compile warnings on s390, e.g.:

mm/bootmem.c: In function 'mark_bootmem':
mm/bootmem.c:321: warning: control reaches end of non-void function

To avoid the warning add an endless loop to the BUG() macro.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:12 +01:00
Heiko Carstens
edd5378740 [S390] mark disabled_wait as noreturn function
disabled_wait() won't return, so add an __attribute__((noreturn)).
This will remove a false positive finding which our internal code
checker reports.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:11 +01:00
Peter Oberparleiter
d7b604891b [S390] cio: update sac values
Values for the sac field have changed - update code accordingly.

Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:06 +01:00
Martin Schwidefsky
547e3cec4f [S390] remove ptrace warning on 31 bit.
A kernel compile on 31 bit gives the following warnings in ptrace.c:

arch/s390/kernel/ptrace.c: In function 'peek_user':
arch/s390/kernel/ptrace.c:207: warning: unused variable 'dummy'
arch/s390/kernel/ptrace.c: In function 'poke_user':
arch/s390/kernel/ptrace.c:315: warning: unused variable 'dummy'

Getting rid of the dummy variables removes the warnings.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:05 +01:00
Heiko Carstens
5d360a75f8 [S390] ftrace: function tracer backend for s390
This implements just the basic function tracer (_mcount) backend for s390.
The dynamic variant will come later.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:03 +01:00
Martin Schwidefsky
6bcac508fb [S390] service level interface.
Add a new proc interface /proc/service_levels that allows any code
to report a relevant service level, e.g. the microcode level of
devices, the service level of the hypervisor, etc.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:00 +01:00
Jan Glauber
22f9934767 [S390] qdio: rework debug feature logging
- make qdio_trace a per device view
- remove s390dbf exceptions
- remove CONFIG_QDIO_DEBUG, not needed anymore if we check for the level
  before calling sprintf
- use snprintf for dbf entries
- add start markers to see if the dbf view wrapped
- add a global error view for all queues

Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:38:59 +01:00
Jan Glauber
bbd50e172f [S390] qdio: fix qeth port count detection
qeth needs to get the port count information before
qdio has allocated a page for the chsc operation.
Extend qdio_get_ssqd_desc() to store the data in the
specified structure.

Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:38:58 +01:00
Felix Beck
cb17a6364a [S390] zcrypt: Use of Thin Interrupts
When the machine supports AP adapter interrupts polling will be
switched off at module initialization and the driver will work in
interrupt mode.

Signed-off-by: Felix Beck <felix.beck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:38:57 +01:00
Heiko Carstens
320c04c068 [S390] Move stfle to header file.
stfle will be needed by the ap_bus module to figure out wether the AP
queue adapter interruption facility is installed.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:38:56 +01:00
Heiko Carstens
ca9fc75a68 [S390] convert s390 to generic IPI infrastructure
Since etr/stp don't need the old smp_call_function semantics anymore
we can convert s390 to the generic IPI infrastructure.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:38:56 +01:00
Martin Schwidefsky
0b3016b781 [S390] serialize stp/etr work
The work function dispatched with schedule_work() can be run twice
on different cpus because run_workqueue clears the WORK_STRUCT_PENDING
bit and then executes the function. Another cpu can call schedule_work()
again and run the work function a second time before the first call
is completed. This patch serialized the etr and stp work function with
a mutex.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:38:56 +01:00
Heiko Carstens
750887dedc [S390] convert etr/stp to stop_machine interface
This converts the etr and stp code to the new stop_machine interface
which allows to synchronize all cpus without allocating any memory.
This way we get rid of the only reason why we haven't converted s390
to the generic IPI interface yet.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:38:55 +01:00
Martin Schwidefsky
b020632e40 [S390] introduce vdso on s390
Add a vdso to speed up gettimeofday and clock_getres/clock_gettime for
CLOCK_REALTIME/CLOCK_MONOTONIC.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:38:55 +01:00
Heiko Carstens
f414f5f153 [S390] cpu topology: dont destroy cpu sets on topology change
Call rebuild_sched_domains instead of arch_reinit_sched_domains if
cpu topology changes. This leaves cpu sets alone which otherwise would
be destroyed.
If and how it makes sense to define cpu sets on a virtualized
architecture is another question.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:38:07 +01:00
Al Viro
8f2961c39e [S390] audit: get s390 ret_from_fork in sync with other architectures
On s390 we have ret_from_fork jump not to the "do all work we
normally do on return from syscall" as on x86, ppc, etc., but to the
"do all such work except audit".  Historical reasons - the codepath
triggered when we have AUDIT process flag set is separated from the
normall one and they converge at sysc_return, which is the common
part of post-syscall work.  And does not include calling audit_syscall_exit() -
that's done in the end of sysc_tracesys path, just before that path jumps
to sysc_return.

	IOW, the child returning from fork()/clone()/vfork() doesn't
call audit_syscall_exit() at all, so no matter what we do with its
audit context, we are not going to see the audit entry.

	The fix is simple: have ret_from_fork go to the point just past
the call of sys_.... in the 'we have AUDIT flag set' path.  There we
have (64bit variant; for 31bit the situation is the same):
sysc_tracenogo:
        tm      __TI_flags+7(%r9),(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT)
        jz      sysc_return
        la      %r2,SP_PTREGS(%r15)     # load pt_regs
        larl    %r14,sysc_return        # return point is sysc_return
        jg      do_syscall_trace_exit
which is precisely what we need - check the flag, bugger off to sysc_return
if not set, otherwise call do_syscall_trace_exit() and bugger off to
sysc_return.  r9 has just been properly set by ret_from_fork itself,
so we are fine.

	Tested on s390x, seems to work fine.  WARNING: it's been about
16 years since my last contact with 3X0 assembler[1], so additional
review would be very welcome.  I don't think I've managed to screw it
up, but...

[1] that *was* in another country and besides, the box is dead...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:38:07 +01:00
Heiko Carstens
5439050f9f [S390] cpu topology: fix cpu_core_map initialization
Common code doesn't call arch_update_cpu_topology() anymore on
cpu hotplug. But our architecture backend relied on that in order to
update the cpu_core_map. For machines without cpu topology support
this leads uninitialized cpu_core_maps for later on added cpus.

To solve this just initialize the maps with cpu_possible_map, since
that will be always valid for machines without topology support.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:38:07 +01:00
Rusty Russell
968ea6d80e Merge ../linux-2.6-x86
Conflicts:

	arch/x86/kernel/io_apic.c
	kernel/sched.c
	kernel/sched_stats.h
2008-12-13 21:55:51 +10:30
Rusty Russell
320ab2b0b1 cpumask: convert struct clock_event_device to cpumask pointers.
Impact: change calling convention of existing clock_event APIs

struct clock_event_timer's cpumask field gets changed to take pointer,
as does the ->broadcast function.

Another single-patch change.  For safety, we BUG_ON() in
clockevents_register_device() if it's not set.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Ingo Molnar <mingo@elte.hu>
2008-12-13 21:20:26 +10:30
Rusty Russell
98a79d6a50 cpumask: centralize cpu_online_map and cpu_possible_map
Impact: cleanup

Each SMP arch defines these themselves.  Move them to a central
location.

Twists:
1) Some archs (m32, parisc, s390) set possible_map to all 1, so we add a
   CONFIG_INIT_ALL_POSSIBLE for this rather than break them.

2) mips and sparc32 '#define cpu_possible_map phys_cpu_present_map'.
   Those archs simply have phys_cpu_present_map replaced everywhere.

3) Alpha defined cpu_possible_map to cpu_present_map; this is tricky
   so I just manipulate them both in sync.

4) IA64, cris and m32r have gratuitous 'extern cpumask_t cpu_possible_map'
   declarations.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Reviewed-by: Grant Grundler <grundler@parisc-linux.org>
Tested-by: Tony Luck <tony.luck@intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Mike Travis <travis@sgi.com>
Cc: ink@jurassic.park.msu.ru
Cc: rmk@arm.linux.org.uk
Cc: starvik@axis.com
Cc: tony.luck@intel.com
Cc: takata@linux-m32r.org
Cc: ralf@linux-mips.org
Cc: grundler@parisc-linux.org
Cc: paulus@samba.org
Cc: schwidefsky@de.ibm.com
Cc: lethal@linux-sh.org
Cc: wli@holomorphy.com
Cc: davem@davemloft.net
Cc: jdike@addtoit.com
Cc: mingo@redhat.com
2008-12-13 21:19:41 +10:30
Heiko Carstens
ee79d1bdb6 sched: let arch_update_cpu_topology indicate if topology changed
Change arch_update_cpu_topology so it returns 1 if the cpu topology changed
and 0 if it didn't change. This will be useful for the next patch which adds
a call to this function in partition_sched_domains.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-12 13:47:21 +01:00
James Morris
ec98ce480a Merge branch 'master' into next
Conflicts:
	fs/nfsd/nfs4recover.c

Manually fixed above to use new creds API functions, e.g.
nfs4_save_creds().

Signed-off-by: James Morris <jmorris@namei.org>
2008-12-04 17:16:36 +11:00
David S. Miller
aa2ba5f108 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	drivers/net/ixgbe/ixgbe_main.c
	drivers/net/smc91x.c
2008-12-02 19:50:27 -08:00
Linus Torvalds
b7d6266062 Merge branch 'kvm-updates/2.6.28' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm
* 'kvm-updates/2.6.28' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm:
  KVM: MMU: avoid creation of unreachable pages in the shadow
  KVM: ppc: stop leaking host memory on VM exit
  KVM: MMU: fix sync of ptes addressed at owner pagetable
  KVM: ia64: Fix: Use correct calling convention for PAL_VPS_RESUME_HANDLER
  KVM: ia64: Fix incorrect kbuild CFLAGS override
  KVM: VMX: Fix interrupt loss during race with NMI
  KVM: s390: Fix problem state handling in guest sigp handler
2008-12-02 15:56:17 -08:00
Linus Torvalds
f1ba3bc7b9 Merge branch 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6
* 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6:
  [S390] Update default configuration.
  [S390] Fix alignment of initial kernel stack.
  [S390] pgtable.h: Fix oops in unmap_vmas for KVM processes
  [S390] fix/cleanup sched_clock
  [S390] fix system call parameter functions.
2008-11-30 11:07:16 -08:00
Christoph Hellwig
96b8936a9e remove __ARCH_WANT_COMPAT_SYS_PTRACE
All architectures now use the generic compat_sys_ptrace, as should every
new architecture that needs 32bit compat (if we'll ever get another).

Remove the now superflous __ARCH_WANT_COMPAT_SYS_PTRACE define, and also
kill a comment about __ARCH_SYS_PTRACE that was added after
__ARCH_SYS_PTRACE was already gone.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-30 11:00:15 -08:00
Martin Schwidefsky
abd942194d [S390] Update default configuration.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-11-27 11:06:58 +01:00
Heiko Carstens
0778dc3a62 [S390] Fix alignment of initial kernel stack.
We need an alignment of 16384 bytes for the initial kernel stack if
the kernel is configured for 16384 bytes stacks but the linker script
currently guarantees only an alignment of 8192 bytes.

So fix this and simply use THREAD_SIZE as alignment value which will
always do the right thing.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-11-27 11:06:58 +01:00
Christian Borntraeger
2944a5c971 [S390] pgtable.h: Fix oops in unmap_vmas for KVM processes
When running several kvm processes with lots of memory overcommitment,
we have seen an oops during process shutdown:
------------[ cut here ]------------
Kernel BUG at 0000000000193434 [verbose debug info unavailable]
addressing exception: 0005 [#1] PREEMPT SMP
Modules linked in: kvm sunrpc qeth_l2 dm_mod qeth ccwgroup
CPU: 10 Not tainted 2.6.28-rc4-kvm-bigiron-00521-g0ccca08-dirty #8
Process kuli (pid: 14460, task: 0000000149822338, ksp: 0000000024f57650)
Krnl PSW : 0704e00180000000 0000000000193434 (unmap_vmas+0x884/0xf10)
R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 EA:3
Krnl GPRS: 0000000000000002 0000000000000000 000000051008d000 000003e05e6034e0
           00000000001933f6 00000000000001e9 0000000407259e0a 00000002be88c400
           00000200001c1000 0000000407259608 0000000407259e08 0000000024f577f0
           0000000407259e09 0000000000445fa8 00000000001933f6 0000000024f577f0
Krnl Code: 0000000000193426: eb22000c000d sllg %r2,%r2,12
           000000000019342c: a7180000 lhi %r1,0
           0000000000193430: b2290012 iske %r1,%r2
          >0000000000193434: a7110002 tmll %r1,2
           0000000000193438: a7840006 brc 8,193444
           000000000019343c: 9602c000 oi 0(%r12),2
           0000000000193440: 96806000 oi 0(%r6),128
           0000000000193444: a7110004 tmll %r1,4
Call Trace:
([<00000000001933f6>] unmap_vmas+0x846/0xf10)
[<0000000000199680>] exit_mmap+0x210/0x458
[<000000000012a8f8>] mmput+0x54/0xfc
[<000000000012f714>] exit_mm+0x134/0x144
[<000000000013120c>] do_exit+0x240/0x878
[<00000000001318dc>] do_group_exit+0x98/0xc8
[<000000000013e6b0>] get_signal_to_deliver+0x30c/0x358
[<000000000010bee0>] do_signal+0xec/0x860
[<0000000000112e30>] sysc_sigpending+0xe/0x22
[<000002000013198a>] 0x2000013198a
INFO: lockdep is turned off.
Last Breaking-Event-Address:
[<00000000001a68d0>] free_swap_and_cache+0x1a0/0x1a4
<4>---[ end trace bc19f1d51ac9db7c ]---

The faulting instruction is the storage key operation (iske) in
ptep_rcp_copy (called by pte_clear, called by unmap_vmas). iske
reads dirty and reference bit information for a physical page and
requires a valid physical address. Since we are in pte_clear, we
cannot rely on the pte containing a valid address. Fortunately we
dont need these information in pte_clear - after all there is no
mapping. The best fix is to remove the needless call to ptep_rcp_copy
that contains the iske.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-11-27 11:06:57 +01:00
Christian Borntraeger
8107d8296b [S390] fix/cleanup sched_clock
CONFIG_PRINTK_TIME reveals that sched_clock has a wrong offset during boot:
..
[    0.000000]   Movable zone: 0 pages used for memmap
[    0.000000] Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 775679
[    0.000000] Kernel command line: dasd=4b6c root=/dev/dasda1 ro noinitrd
[    0.000000] PID hash table entries: 4096 (order: 12, 32768 bytes)
[6920575.975232] console [ttyS0] enabled
[6920575.987586] Dentry cache hash table entries: 524288 (order: 10, 4194304 bytes)
[6920575.991404] Inode-cache hash table entries: 262144 (order: 9, 2097152 bytes)
..

The s390 implementation of sched_clock uses the store clock instruction and
subtracts jiffies_timer_cc.
jiffies_timer_cc is a local variable in arch/s390/kernel/time.c and only used
for sched_clock and monotonic clock. For historical reasons there is an offset
on that value. With todays code this offset is unnecessary. By removing that
offset we can get a sched_clock which returns the nanoseconds after time_init.
This improves CONFIG_PRINTK_TIME.

Since sched_clock is the only user, I have also renamed jiffies_timer_cc to
sched_clock_base_cc. In addition, the local variable init_timer_cc is redundant
and can be romved as well.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-11-27 11:06:57 +01:00
Martin Schwidefsky
59da21398e [S390] fix system call parameter functions.
syscall_get_nr() currently returns a valid result only if the call
chain of the traced process includes do_syscall_trace_enter(). But
collect_syscall() can be called for any sleeping task, the result of
syscall_get_nr() in general is completely bogus.

To make syscall_get_nr() work for any sleeping task the traps field
in pt_regs is replace with svcnr - the system call number the process
is executing. If svcnr == 0 the process is not on a system call path.

The syscall_get_arguments and syscall_set_arguments use regs->gprs[2]
for the first system call parameter. This is incorrect since gprs[2]
may have been overwritten with the system call number if the call
chain includes do_syscall_trace_enter. Use regs->orig_gprs2 instead.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-11-27 11:06:56 +01:00
Christian Borntraeger
3eb77d5116 KVM: s390: Fix problem state handling in guest sigp handler
We can get an exit for instructions starting with 0xae, even if the guest is
in userspace. Lets make sure, that the signal processor handler is only called
in guest supervisor mode. Otherwise, send a program check.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-11-23 14:34:39 +02:00
Stephen Hemminger
eeda3fd64f netdev: introduce dev_get_stats()
In order for the network device ops get_stats call to be immutable, the handling
of the default internal network device stats block has to be changed. Add a new
helper function which replaces the old use of internal_get_stats.

Note: change return code to make it clear that the caller should not
go changing the returned statistics.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-19 21:40:23 -08:00
James Morris
f3a5c54701 Merge branch 'master' into next
Conflicts:
	fs/cifs/misc.c

Merge to resolve above, per the patch below.

Signed-off-by: James Morris <jmorris@namei.org>

diff --cc fs/cifs/misc.c
index ec36410,addd1dc..0000000
--- a/fs/cifs/misc.c
+++ b/fs/cifs/misc.c
@@@ -347,13 -338,13 +338,13 @@@ header_assemble(struct smb_hdr *buffer
  		/*  BB Add support for establishing new tCon and SMB Session  */
  		/*      with userid/password pairs found on the smb session   */
  		/*	for other target tcp/ip addresses 		BB    */
 -				if (current->fsuid != treeCon->ses->linux_uid) {
 +				if (current_fsuid() != treeCon->ses->linux_uid) {
  					cFYI(1, ("Multiuser mode and UID "
  						 "did not match tcon uid"));
- 					read_lock(&GlobalSMBSeslock);
- 					list_for_each(temp_item, &GlobalSMBSessionList) {
- 						ses = list_entry(temp_item, struct cifsSesInfo, cifsSessionList);
+ 					read_lock(&cifs_tcp_ses_lock);
+ 					list_for_each(temp_item, &treeCon->ses->server->smb_ses_list) {
+ 						ses = list_entry(temp_item, struct cifsSesInfo, smb_ses_list);
 -						if (ses->linux_uid == current->fsuid) {
 +						if (ses->linux_uid == current_fsuid()) {
  							if (ses->server == treeCon->ses->server) {
  								cFYI(1, ("found matching uid substitute right smb_uid"));
  								buffer->Uid = ses->Suid;
2008-11-18 18:52:37 +11:00
Martin Schwidefsky
d2f019fe40 [S390] fix s390x_newuname
The uname system call for 64 bit compares current->personality without
masking the upper 16 bits. If e.g. READ_IMPLIES_EXEC is set the result
of a uname system call will always be s390x even if the process uses
the s390 personality.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-11-14 18:18:55 +01:00
Heiko Carstens
74af283102 [S390] cpu topology: fix locking
cpu_coregroup_map used to grab a mutex on s390 since it was only
called from process context.
Since c7c22e4d5c "block: add support
for IO CPU affinity" this is not true anymore.
It now also gets called from softirq context.

To prevent possible deadlocks change this in architecture code and
use a spinlock instead of a mutex.

Cc: stable@kernel.org
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-11-14 18:18:54 +01:00
Heiko Carstens
50bec4ce5d [S390] ftrace: fix kernel stack backchain walking
With CONFIG_IRQSOFF_TRACER the trace_hardirqs_off() function includes
a call to __builtin_return_address(1). But we calltrace_hardirqs_off()
from early entry code. There we have just a single stack frame.
So this results in a kernel stack backchain walk that would walk beyond
the kernel stack. Following the NULL terminated backchain this results
in a lowcore read access.

To fix this we simply call trace_hardirqs_off_caller() and pass the
current instruction pointer.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-11-14 18:18:53 +01:00
Heiko Carstens
632448f650 [S390] ftrace: disable tracing on idle psw
Disable tracing on idle psw. Otherwise it would give us huge
preempt off times for idle. Which is rather pointless.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-11-14 18:18:53 +01:00
Heiko Carstens
af4c68740e [S390] lockdep: fix compile bug
arch/s390/kernel/built-in.o: In function `cleanup_io_leave_insn':
mem_detect.c:(.text+0x10592): undefined reference to `lockdep_sys_exit'

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-11-14 18:18:53 +01:00
Gerald Schaefer
fb2e7c5e33 [S390] Fix range for add_active_range() in setup_memory()
add_active_range() expects start_pfn + size as end_pfn value, i.e. not
the pfn of the last page frame but the one behind that.
We used the pfn of the last page frame so far, which can lead to a
BUG_ON in move_freepages(), when the kernelcore parameter is specified
(page_zone(start_page) != page_zone(end_page)).

Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-11-14 18:18:51 +01:00
David Howells
b6dff3ec5e CRED: Separate task security context from task_struct
Separate the task security context from task_struct.  At this point, the
security data is temporarily embedded in the task_struct with two pointers
pointing to it.

Note that the Alpha arch is altered as it refers to (E)UID and (E)GID in
entry.S via asm-offsets.

With comment fixes Signed-off-by: Marc Dionne <marc.c.dionne@gmail.com>

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:39:16 +11:00
David Howells
e542370532 CRED: Wrap task credential accesses in the S390 arch
Wrap access to task credentials so that they can be separated more easily from
the task_struct during the introduction of COW creds.

Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

Change some task->e?[ug]id to task_e?[ug]id().  In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these will be
addressed by later patches.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: linux-s390@vger.kernel.org
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:38:39 +11:00
Christian Borntraeger
ea4bfdf52a [S390] s390: Fix build for !CONFIG_S390_GUEST + CONFIG_VIRTIO_CONSOLE
The s390 kernel does not compile if virtio console is enabled, but guest
support is disabled:

  LD      .tmp_vmlinux1
arch/s390/kernel/built-in.o: In function `setup_arch':
/space/linux-2.5/arch/s390/kernel/setup.c:773: undefined reference to
`s390_virtio_console_init'

The fix is related to
commit 99e65c92f2
Author: Christian Borntraeger <borntraeger@de.ibm.com>
Date:   Fri Jul 25 15:50:04 2008 +0200
    KVM: s390: Fix guest kconfig

Which changed the build process to build kvm_virtio.c only if CONFIG_S390_GUEST
is set. We must ifdef the prototype in the header file accordingly.

Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-28 11:12:06 +01:00
Heiko Carstens
7f5a8ba6b0 [S390] No more 4kb stacks.
We got a stack overflow with a small stack configuration on a 32 bit
system. It just looks like as 4kb isn't enough and too dangerous.
So lets get rid of 4kb stacks on 32 bit.

But one thing I completely dislike about the call trace below is that
just for debugging or tracing purposes sprintf gets called (cio_start_key):

	/* process condition code */
	sprintf(dbf_txt, "ccode:%d", ccode);
	CIO_TRACE_EVENT(4, dbf_txt);

But maybe its just me who thinks that this could be done better.

    <4>Kernel stack overflow.
    <4>Modules linked in: dm_multipath sunrpc bonding qeth_l2 dm_mod qeth ccwgroup vmur
    <4>CPU: 1 Not tainted 2.6.27-30.x.20081015-s390default #1
    <4>Process httpd (pid: 3807, task: 20ae2df8, ksp: 1666fb78)
    <4>Krnl PSW : 040c0000 8027098a (number+0xe/0x348)
    <4>           R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:0 CC:0 PM:0
    <4>Krnl GPRS: 00d43318 0027097c 1666f277 9666f270
    <4>           00000000 00000000 0000000a ffffffff
    <4>           9666f270 1666f228 1666f277 1666f098
    <4>           00000002 80270982 80271016 1666f098
    <4>Krnl Code: 8027097e: f0340dd0a7f1	srp	3536(4,%r0),2033(%r10),4
    <4>           80270984: 0f00		clcl	%r0,%r0
    <4>           80270986: a7840001		brc	8,80270988
    <4>          >8027098a: 18ef		lr	%r14,%r15
    <4>           8027098c: a7faff68		ahi	%r15,-152
    <4>           80270990: 18bf		lr	%r11,%r15
    <4>           80270992: 18a2		lr	%r10,%r2
    <4>           80270994: 1893		lr	%r9,%r3

Modified calltrace with annotated stackframe size of each function:

stackframe size
    |
 0 304 vsnprintf+850 [0x271016]
 1  72 sprintf+74 [0x271522]
 2  56 cio_start_key+262 [0x2d4c16]
 3  56 ccw_device_start_key+222 [0x2dfe92]
 4  56 ccw_device_start+40 [0x2dff28]
 5  48 raw3215_start_io+104 [0x30b0f8]
 6  56 raw3215_write+494 [0x30ba0a]
 7  40 con3215_write+68 [0x30bafc]
 8  40 __call_console_drivers+146 [0x12b0fa]
 9  32 _call_console_drivers+102 [0x12b192]
10  64 release_console_sem+268 [0x12b614]
11 168 vprintk+462 [0x12bca6]
12  72 printk+68 [0x12bfd0]
13 256 __print_symbol+50 [0x15a882]
14  56 __show_trace+162 [0x103d06]
15  32 show_trace+224 [0x103e70]
16  48 show_stack+152 [0x103f20]
17  56 dump_stack+126 [0x104612]
18  96 __alloc_pages_internal+592 [0x175004]
19  80 cache_alloc_refill+776 [0x196f3c]
20  40 __kmalloc+258 [0x1972ae]
21  40 __alloc_skb+94 [0x328086]
22  32 pskb_copy+50 [0x328252]
23  32 skb_realloc_headroom+110 [0x328a72]
24 104 qeth_l2_hard_start_xmit+378 [0x7803bfde]
25  56 dev_hard_start_xmit+450 [0x32ef6e]
26  56 __qdisc_run+390 [0x3425d6]
27  48 dev_queue_xmit+410 [0x331e06]
28  40 ip_finish_output+308 [0x354ac8]
29  56 ip_output+218 [0x355b6e]
30  24 ip_local_out+56 [0x354584]
31 120 ip_queue_xmit+300 [0x355cec]
32  96 tcp_transmit_skb+812 [0x367da8]
33  40 tcp_push_one+158 [0x369fda]
34 112 tcp_sendmsg+852 [0x35d5a0]
35 240 sock_sendmsg+164 [0x32035c]
36  56 kernel_sendmsg+86 [0x32064a]
37  88 sock_no_sendpage+98 [0x322b22]
38 104 tcp_sendpage+70 [0x35cc1e]
39  48 sock_sendpage+74 [0x31eb66]
40  64 pipe_to_sendpage+102 [0x1c4b2e]
41  64 __splice_from_pipe+120 [0x1c5340]
42  72 splice_from_pipe+90 [0x1c57e6]
43  56 generic_splice_sendpage+38 [0x1c5832]
44  48 do_splice_from+104 [0x1c4c38]
45  48 direct_splice_actor+52 [0x1c4c88]
46  80 splice_direct_to_actor+180 [0x1c4f80]
47  72 do_splice_direct+70 [0x1c5112]
48  64 do_sendfile+360 [0x19de18]
49  72 sys_sendfile64+126 [0x19df32]
50 336 sysc_do_restart+18 [0x111a1a]

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-28 11:12:06 +01:00
Heiko Carstens
46e7951f94 [S390] Change default IPL method to IPL_VM.
allyesconfig and allmodconfig built kernels have a tape IPL record.
A the vmreader record makes much more sense, since hardly anybody will
ever IPL a kernel from tape. So change the default.
As I side effect I can test these kernels without fiddling around with
the kernel config ;)

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-28 11:12:06 +01:00
Roel Kluin
13f8b7c5e6 [S390] appldata: unsigned ops->size cannot be negative
unsigned ops->size cannot be negative

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-28 11:12:04 +01:00
Heiko Carstens
da5aae7036 [S390] Fix sysdev class file creation.
Use sysdev_class_create_file() to create create sysdev class attributes
instead of sysfs_create_file(). Using sysfs_create_file() wasn't a very
good idea since the show and store functions have a different amount of
parameters for sysfs files and sysdev class files.
In particular the pointer to the buffer is the last argument and
therefore accesses to random memory regions happened.
Still worked surprisingly well until we got a kernel panic.

Cc: stable@kernel.org
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-28 11:12:03 +01:00
Christian Borntraeger
250cf776f7 [S390] pgtables: Fix race in enable_sie vs. page table ops
The current enable_sie code sets the mm->context.pgstes bit to tell
dup_mm that the new mm should have extended page tables. This bit is also
used by the s390 specific page table primitives to decide about the page
table layout - which means context.pgstes has two meanings. This can cause
any kind of bugs. For example  - e.g. shrink_zone can call
ptep_clear_flush_young while enable_sie is running. ptep_clear_flush_young
will test for context.pgstes. Since enable_sie changed that value of the old
struct mm without changing the page table layout ptep_clear_flush_young will
do the wrong thing.
The solution is to split pgstes into two bits
- one for the allocation
- one for the current state

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-28 11:12:03 +01:00
Matt Helsley
dc52ddc0e6 container freezer: implement freezer cgroup subsystem
This patch implements a new freezer subsystem in the control groups
framework.  It provides a way to stop and resume execution of all tasks in
a cgroup by writing in the cgroup filesystem.

The freezer subsystem in the container filesystem defines a file named
freezer.state.  Writing "FROZEN" to the state file will freeze all tasks
in the cgroup.  Subsequently writing "RUNNING" will unfreeze the tasks in
the cgroup.  Reading will return the current state.

* Examples of usage :

   # mkdir /containers/freezer
   # mount -t cgroup -ofreezer freezer  /containers
   # mkdir /containers/0
   # echo $some_pid > /containers/0/tasks

to get status of the freezer subsystem :

   # cat /containers/0/freezer.state
   RUNNING

to freeze all tasks in the container :

   # echo FROZEN > /containers/0/freezer.state
   # cat /containers/0/freezer.state
   FREEZING
   # cat /containers/0/freezer.state
   FROZEN

to unfreeze all tasks in the container :

   # echo RUNNING > /containers/0/freezer.state
   # cat /containers/0/freezer.state
   RUNNING

This is the basic mechanism which should do the right thing for user space
task in a simple scenario.

It's important to note that freezing can be incomplete.  In that case we
return EBUSY.  This means that some tasks in the cgroup are busy doing
something that prevents us from completely freezing the cgroup at this
time.  After EBUSY, the cgroup will remain partially frozen -- reflected
by freezer.state reporting "FREEZING" when read.  The state will remain
"FREEZING" until one of these things happens:

	1) Userspace cancels the freezing operation by writing "RUNNING" to
		the freezer.state file
	2) Userspace retries the freezing operation by writing "FROZEN" to
		the freezer.state file (writing "FREEZING" is not legal
		and returns EIO)
	3) The tasks that blocked the cgroup from entering the "FROZEN"
		state disappear from the cgroup's set of tasks.

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: export thaw_process]
Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
Signed-off-by: Matt Helsley <matthltc@us.ibm.com>
Acked-by: Serge E. Hallyn <serue@us.ibm.com>
Tested-by: Matt Helsley <matthltc@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-20 08:52:34 -07:00
Matt Helsley
83224b0837 container freezer: add TIF_FREEZE flag to all architectures
This patch series introduces a cgroup subsystem that utilizes the swsusp
freezer to freeze a group of tasks.  It's immediately useful for batch job
management scripts.  It should also be useful in the future for
implementing container checkpoint/restart.

The freezer subsystem in the container filesystem defines a cgroup file
named freezer.state.  Reading freezer.state will return the current state
of the cgroup.  Writing "FROZEN" to the state file will freeze all tasks
in the cgroup.  Subsequently writing "RUNNING" will unfreeze the tasks in
the cgroup.

* Examples of usage :

   # mkdir /containers/freezer
   # mount -t cgroup -ofreezer freezer  /containers
   # mkdir /containers/0
   # echo $some_pid > /containers/0/tasks

to get status of the freezer subsystem :

   # cat /containers/0/freezer.state
   RUNNING

to freeze all tasks in the container :

   # echo FROZEN > /containers/0/freezer.state
   # cat /containers/0/freezer.state
   FREEZING
   # cat /containers/0/freezer.state
   FROZEN

to unfreeze all tasks in the container :

   # echo RUNNING > /containers/0/freezer.state
   # cat /containers/0/freezer.state
   RUNNING

This patch:

The first step in making the refrigerator() available to all
architectures, even for those without power management.

The purpose of such a change is to be able to use the refrigerator() in a
new control group subsystem which will implement a control group freezer.

[akpm@linux-foundation.org: fix sparc]
Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
Signed-off-by: Matt Helsley <matthltc@us.ibm.com>
Acked-by: Pavel Machek <pavel@suse.cz>
Acked-by: Serge E. Hallyn <serue@us.ibm.com>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Nigel Cunningham <nigel@tuxonice.net>
Tested-by: Matt Helsley <matthltc@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-20 08:52:33 -07:00
Badari Pulavarty
71088785c6 mm: cleanup to make remove_memory() arch-neutral
There is nothing architecture specific about remove_memory().
remove_memory() function is common for all architectures which support
hotplug memory remove.  Instead of duplicating it in every architecture,
collapse them into arch neutral function.

[akpm@linux-foundation.org: fix the export]
Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: Gary Hade <garyhade@us.ibm.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-20 08:50:25 -07:00
Linus Torvalds
08d19f51f0 Merge branch 'kvm-updates/2.6.28' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm
* 'kvm-updates/2.6.28' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm: (134 commits)
  KVM: ia64: Add intel iommu support for guests.
  KVM: ia64: add directed mmio range support for kvm guests
  KVM: ia64: Make pmt table be able to hold physical mmio entries.
  KVM: Move irqchip_in_kernel() from ioapic.h to irq.h
  KVM: Separate irq ack notification out of arch/x86/kvm/irq.c
  KVM: Change is_mmio_pfn to kvm_is_mmio_pfn, and make it common for all archs
  KVM: Move device assignment logic to common code
  KVM: Device Assignment: Move vtd.c from arch/x86/kvm/ to virt/kvm/
  KVM: VMX: enable invlpg exiting if EPT is disabled
  KVM: x86: Silence various LAPIC-related host kernel messages
  KVM: Device Assignment: Map mmio pages into VT-d page table
  KVM: PIC: enhance IPI avoidance
  KVM: MMU: add "oos_shadow" parameter to disable oos
  KVM: MMU: speed up mmu_unsync_walk
  KVM: MMU: out of sync shadow core
  KVM: MMU: mmu_convert_notrap helper
  KVM: MMU: awareness of new kvm_mmu_zap_page behaviour
  KVM: MMU: mmu_parent_walk
  KVM: x86: trap invlpg
  KVM: MMU: sync roots on mmu reload
  ...
2008-10-16 15:36:00 -07:00
Linus Torvalds
e4856a70cf Merge branch 'personality' of git://git390.osdl.marist.edu/pub/scm/linux-2.6
* 'personality' of git://git390.osdl.marist.edu/pub/scm/linux-2.6:
  [PATCH] remove unused ibcs2/PER_SVR4 in SET_PERSONALITY
2008-10-16 12:32:52 -07:00
Christoph Hellwig
b418da16dd compat: generic compat get/settimeofday
Nothing arch specific in get/settimeofday.  The details of the timeval
conversion varied a little from arch to arch, but all with the same
results.

Also add an extern declaration for sys_tz to linux/time.h because externs
in .c files are fowned upon.  I'll kill the externs in various other files
in a sparate patch.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: David S. Miller <davem@davemloft.net> [ sparc bits ]
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Acked-by: Kyle McMartin <kyle@mcmartin.ca>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Grant Grundler <grundler@parisc-linux.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16 11:21:33 -07:00
Christoph Hellwig
f7a5000f7a compat: move cp_compat_stat to common code
struct stat / compat_stat is the same on all architectures, so
cp_compat_stat should be, too.

Turns out it is, except that various architectures have slightly and some
high2lowuid/high2lowgid or the direct assignment instead of the
SET_UID/SET_GID that expands to the correct one anyway.

This patch replaces the arch-specific cp_compat_stat implementations with
a common one based on the x86-64 one.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: David S. Miller <davem@davemloft.net> [ sparc bits ]
Acked-by: Kyle McMartin <kyle@mcmartin.ca> [ parisc bits ]
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16 11:21:33 -07:00
Martin Schwidefsky
0b59268285 [PATCH] remove unused ibcs2/PER_SVR4 in SET_PERSONALITY
The SET_PERSONALITY macro is always called with a second argument of 0.
Remove the ibcs argument and the various tests to set the PER_SVR4
personality.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-16 15:40:05 +02:00
Christian Borntraeger
20766c083e KVM: s390: change help text of guest Kconfig
The current help text for CONFIG_S390_GUEST is not very helpful.
Lets add more text.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15 10:15:25 +02:00
Christian Borntraeger
a0046b6db1 KVM: s390: Make facility bits future-proof
Heiko Carstens pointed out, that its safer to activate working facilities
instead of disabling problematic facilities. The new code uses the host
facility bits and masks it with known good ones.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15 10:15:24 +02:00
Steven Whitehouse
a447c09324 vfs: Use const for kernel parser table
This is a much better version of a previous patch to make the parser
tables constant. Rather than changing the typedef, we put the "const" in
all the various places where its required, allowing the __initconst
exception for nfsroot which was the cause of the previous trouble.

This was posted for review some time ago and I believe its been in -mm
since then.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Alexander Viro <aviro@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-13 10:10:37 -07:00
David Woodhouse
e758936e02 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6
Conflicts:

	include/asm-x86/statfs.h
2008-10-13 17:13:56 +01:00
Linus Torvalds
37d9869ed9 Merge branch 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6
* 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6: (27 commits)
  [S390] Fix checkstack for s390
  [S390] fix initialization of stp
  [S390] 3215: Remove tasklet.
  [S390] console flush on panic / reboot
  [S390] introduce dirty bit for kvm live migration
  [S390] Add ioctl support for EMC Symmetrix Subsystem Control I/O
  [S390] xpram: per device block request queues.
  [S390] dasd: fix message flood for unsolicited interrupts
  [S390] Move private simple udelay function to arch/s390/lib/delay.c.
  [S390] dcssblk: add >2G DCSSs support and stacked contiguous DCSSs support.
  [S390] ptrace changes
  [S390] s390: use sys_pause for 31bit pause entry point
  [S390] qdio enhanced SIGA (iqdio) support.
  [S390] cio: fix cio_tpi.
  [S390] cio: Correct use of ! and &
  [S390] cio: inline assembly cleanup
  [S390] bus_id -> dev_set_name() for css and ccw busses
  [S390] bus_id ->dev_name() conversions in qdio
  [S390] Use s390_root_dev_* in kvm_virtio.
  [S390] more bus_id -> dev_name conversions
  ...
2008-10-11 08:50:01 -07:00
Martin Schwidefsky
4a672cfa3a [S390] fix initialization of stp
chsc_sstpc returns -EIO on error and 0 on success but stp_reset checks
against 1 instead of 0. chsc_sstpc used to return 1 on success, one
call location has not been updated ..

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-10 21:34:02 +02:00
Florian Funke
15e86b0c75 [S390] introduce dirty bit for kvm live migration
This patch defines a dirty bit in the PGSTE that can be used to implement
dirty pages logging for KVM's live migration. The bit is set in the
ptep_rcp_copy function, which is called to save dirty and referenced information
from the storage key in the PGSTE. The bit can be tested and reset by KVM using
the kvm_s390_test_and_clear_page_dirty function that is introduced by this patch.

Acked-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Florian Funke <ffunke@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-10 21:34:00 +02:00
Nigel Hislop
ab1d848fd6 [S390] Add ioctl support for EMC Symmetrix Subsystem Control I/O
EMC Symmetrix Subsystem Control I/O through CKD dasd requires a
specific parameter list sent to the array via a Perform Subsystem
Function CCW. The Symmetrix response is retrieved from the array
via a Read Subsystem Data CCW.

Signed-off-by: Nigel Hislop <hislop_nigel@emc.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-10 21:34:00 +02:00
Heiko Carstens
5a0d0e6537 [S390] Move private simple udelay function to arch/s390/lib/delay.c.
Move cio's private simple udelay function to lib/delay.c and turn it
into something much more readable. So we have all implementations
at one place.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-10 21:33:58 +02:00
Hongjie Yang
b2300b9efe [S390] dcssblk: add >2G DCSSs support and stacked contiguous DCSSs support.
The DCSS block device driver is modified to add >2G DCSSs support and
allow a DCSS block device to map to a set of contiguous DCSSs.  The
extmem code is also modified to use new Diagnose x'64' subcodes for
>2G DCSSs.

Signed-off-by: Hongjie Yang <hongjie@us.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-10 21:33:57 +02:00
Martin Schwidefsky
753c4dd6a2 [S390] ptrace changes
* System call parameter and result access functions
* Add tracehook calls
* Split syscall_trace into two functions do_syscall_trace_enter and
  do_syscall_trace_exit

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-10 21:33:57 +02:00
Christoph Hellwig
d86730bb95 [S390] s390: use sys_pause for 31bit pause entry point
sys32_pause is a useless copy of the generic sys_pause.
(and it's certainly not there for old sparc32 binaries..)

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-10 21:33:56 +02:00
Klaus-Dieter Wacker
7a0f475513 [S390] qdio enhanced SIGA (iqdio) support.
Add support for z10 HiperSockets multiwrite SBALs on output
queues. This is used on LPAR with EDDP enabled devices.

Signed-off-by: Klaus-Dieter Wacker <kdwacker@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-10 21:33:55 +02:00
Ingo Molnar
990d0f2ced Merge branches 'sched/devel', 'sched/cpu-hotplug', 'sched/cpusets' and 'sched/urgent' into sched/core 2008-10-08 11:31:02 +02:00
Heiko Carstens
d3d238c774 [S390] nohz: Fix __udelay.
This fixes a regression that came with 934b2857cc
("[S390] nohz/sclp: disable timer on synchronous waits.").
If udelay() gets called from a disabled context it sets the clock comparator
to a value where it expects the next interrupt. When the interrupt happens
the clock comparator gets not reset and therefore the interrupt condition
doesn't get cleared. The result is an endless timer interrupt loop.

In addition this patch fixes also the following:

rcutorture reveals that our __udelay implementation is still buggy,
since it might schedule tasklets, but prevents their execution:

NOHZ: local_softirq_pending 42
NOHZ: local_softirq_pending 02
NOHZ: local_softirq_pending 142
NOHZ: local_softirq_pending 02

To fix this we make sure that only the clock comparator interrupt
is enabled when the enabled wait psw is loaded.
Also no code gets called anymore which might schedule tasklets.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-03 21:55:54 +02:00