Commit graph

186232 commits

Author SHA1 Message Date
Akira Fujita
7247c0caa2 ext4: Fix the NULL reference in double_down_write_data_sem()
If EXT4_IOC_MOVE_EXT ioctl is called with NULL donor_fd, fget() in
ext4_ioctl() gets inappropriate file structure for donor; so we need
to do this check earlier, before calling double_down_write_data_sem().

Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-03-04 00:34:58 -05:00
Akira Fujita
5fd5249aa3 ext4: Fix insertion point of extent in mext_insert_across_blocks()
If the leaf node has 2 extent space or fewer and EXT4_IOC_MOVE_EXT
ioctl is called with the file offset where after the 2nd extent
covers, mext_insert_across_blocks() always tries to insert extent into
the first extent.  As a result, the file gets corrupted because of
wrong extent order.  The patch fixes this problem.

Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-03-04 00:31:06 -05:00
Steffen Klassert
7478138782 padata: Allocate the cpumask for the padata instance
The cpumask of the padata instance was used without allocated.
This caused boot crashes if CONFIG_CPUMASK_OFFSTACK is enabled.
This patch fixes this by doing proper allocation for this cpumask.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2010-03-04 13:30:22 +08:00
Akinobu Mita
731eb1a03a ext4: consolidate in_range() definitions
There are duplicate macro definitions of in_range() in mballoc.h and
balloc.c.  This consolidates these two definitions into ext4.h, and
changes extents.c to use in_range() as well.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Andreas Dilger <adilger@sun.com>
2010-03-03 23:55:01 -05:00
Akinobu Mita
bda00de7e8 ext4: cleanup to use ext4_grp_offs_to_block()
More cleanup to convert open-coded calculations of the first block
number of a free extent to use ext4_grp_offs_to_block() instead.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Andreas Dilger <adilger@sun.com>
2010-03-03 23:53:25 -05:00
Akinobu Mita
5661bd6861 ext4: cleanup to use ext4_group_first_block_no()
This is a cleanup and simplification patch which takes some open-coded
calculations to calculate the first block number of a group and
converts them to use the (already defined) ext4_group_first_block_no()
function.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Andreas Dilger <adilger@sun.com>
2010-03-03 23:53:39 -05:00
Dan Williams
dd58ffcf5a Merge branch 'coh' into dmaengine 2010-03-03 21:22:21 -07:00
Dan Williams
aa4d72ae94 ioat: cleanup ->timer_fn() and ->cleanup_fn() prototypes
If the calling convention of ->timer_fn() and ->cleanup_fn() are unified
across hardware versions we can drop parameters to ioat_init_channel() and
unify ioat_is_dma_complete() implementations.

Both ->timer_fn() and ->cleanup_fn() are modified to expect a struct
dma_chan pointer.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2010-03-03 21:21:13 -07:00
Dan Williams
b9cc98697d ioat3: interrupt coalescing
The hardware automatically disables further interrupts after each event
until rearmed.  This allows a delay to be injected between the occurence
of the interrupt and the running of the cleanup routine.  The delay is
scaled by the descriptor backlog and then written to the INTRDELAY
register which specifies the number of microseconds to hold off
interrupt delivery after an interrupt event occurs.  According to
powertop this reduces the interrupt rate from ~5000 intr/s to ~150
intr/s per without affecting throughput (simple dd to a raid6 array).

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2010-03-03 21:21:13 -07:00
Dan Williams
aa75db0080 ioat: close potential BUG_ON race in the descriptor cleanup path
Since ioat_cleanup_preamble() and the update of the last completed
descriptor are not synchronized there is a chance that two cleanup threads
can see descriptors to clean.  If the first cleans up all pending
descriptors then the second will trigger the BUG_ON.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2010-03-03 21:21:10 -07:00
Linus Torvalds
a27341cd5f Prioritize synchronous signals over 'normal' signals
This makes sure that we pick the synchronous signals caused by a
processor fault over any pending regular asynchronous signals sent to
use by [t]kill().

This is not strictly required semantics, but it makes it _much_ easier
for programs like Wine that expect to find the fault information in the
signal stack.

Without this, if a non-synchronous signal gets picked first, the delayed
asynchronous signal will have its signal context pointing to the new
signal invocation, rather than the instruction that caused the SIGSEGV
or SIGBUS in the first place.

This is not all that pretty, and we're discussing making the synchronous
signals more explicit rather than have these kinds of implicit
preferences of SIGSEGV and friends.  See for example

	http://bugzilla.kernel.org/show_bug.cgi?id=15395

for some of the discussion.  But in the meantime this is a simple and
fairly straightforward work-around, and the whole

	if (x & Y)
		x &= Y;

thing can be compiled into (and gcc does do it) just three instructions:

	movq    %rdx, %rax
	andl    $Y, %eax
	cmovne  %rax, %rdx

so it is at least a simple solution to a subtle issue.

Reported-and-tested-by: Pavel Vilim <wylda@volny.cz>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-03 19:21:10 -08:00
Al Viro
9643f5d94a Merge branch 'for-fsnotify' into for-linus 2010-03-03 17:12:40 -05:00
Jan Kara
9b1d0998d2 ext4: Release page references acquired in ext4_da_block_invalidatepages
We forget to release page references we acquire in
ext4_da_block_invalidatepages.  Luckily, this function gets called only if we
are not able to allocate blocks for delay-allocated data so that function
should better never be called.

Also cleanup handling of index variable.

Reported-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-03-03 16:19:32 -05:00
Eric W. Biederman
2bd3a997be init: Open /dev/console from rootfs
To avoid potential problems with an empty /dev open /dev/console
from rootfs instead of waiting to mount our root filesystem and
mounting it there.   This effectively guarantees that there will
be a device node, and it won't be on a filesystem that we will
ever unmount, so there are no issues with leaving /dev/console
open and pinning the filesystem.

This is actually more effective than automatically mounting
devtmpfs on /dev because it removes removes the occasionally
problematic assumption that /dev/console exists from the boot
code.

With this patch I was able to throw busybox on my /boot partition
(which has no /dev directory) and boot into userspace without
problems.

The only possible negative consequence I can think of is that
someone out there deliberately used did not use a character device
that is major 5 minor 2 for /dev/console.  Does anyone know of a
situation in which that could make sense?

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-03 14:56:07 -05:00
André Goddard Rosa
2329e392ac mqueue: fix typo "failues" -> "failures"
Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-03 14:48:00 -05:00
André Goddard Rosa
8d8ffefaaf mqueue: only set error codes if they are really necessary
... postponing assignments until they're needed. Doesn't change code size.

Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-03 14:48:00 -05:00
André Goddard Rosa
04db0dde0e mqueue: simplify do_open() error handling
It reduces code size:
text    data     bss     dec     hex filename
9925      72      16   10013    271d ipc/mqueue-BEFORE.o
9885      72      16    9973    26f5 ipc/mqueue-AFTER.o

Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-03 14:48:00 -05:00
André Goddard Rosa
8834cf796a mqueue: apply mathematics distributivity on mq_bytes calculation
Code size reduction:
   text    data     bss     dec     hex filename
   9941      72      16   10029    272d ipc/mqueue-BEFORE.o
   9925      72      16   10013    271d ipc/mqueue-AFTER.o

Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-03 14:48:00 -05:00
André Goddard Rosa
c8308b1c91 mqueue: remove unneeded info->messages initialization
... and abort earlier if we couldn't allocate the message pointers array,
avoiding the u->mq_bytes accounting logic.

It reduces code size:
   text    data     bss     dec     hex filename
   9949      72      16   10037    2735 ipc/mqueue-BEFORE.o
   9941      72      16   10029    272d ipc/mqueue-AFTER.o

Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-03 14:47:59 -05:00
André Goddard Rosa
4294a8eedb mqueue: fix mq_open() file descriptor leak on user-space processes
We leak fd on lookup_one_len() failure

Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-03 14:46:05 -05:00
Al Viro
4919c5e45a fix race in d_splice_alias()
rehashing the negative placeholder opens a race with d_lookup();
we unhash it almost immediately (by d_move()), but the race
window is there.  Since d_move() doesn't rely on target being
hashed, we don't need that d_rehash() at all.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-03 14:13:08 -05:00
Al Viro
bec1052e5b set S_DEAD on unlink() and non-directory rename() victims
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-03 14:12:08 -05:00
Miklos Szeredi
db1f05bb85 vfs: add NOFOLLOW flag to umount(2)
Add a new UMOUNT_NOFOLLOW flag to umount(2).  This is needed to prevent
symlink attacks in unprivileged unmounts (fuse, samba, ncpfs).

Additionally, return -EINVAL if an unknown flag is used (and specify
an explicitly unused flag: UMOUNT_UNUSED).  This makes it possible for
the caller to determine if a flag is supported or not.

CC: Eugene Teo <eugene@redhat.com>
CC: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-03 14:08:00 -05:00
Al Viro
440b3c6c16 get rid of ->mnt_parent in tomoyo/realpath
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-03 14:08:00 -05:00
Al Viro
0ceeca5a08 hppfs can use existing proc_mnt, no need for do_kern_mount() in there
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-03 14:08:00 -05:00
Al Viro
8089352a13 Mirror MS_KERNMOUNT in ->mnt_flags
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-03 14:08:00 -05:00
Al Viro
d498b25a4f get rid of useless vfsmount_lock use in put_mnt_ns()
It hadn't been needed since we'd sanitized the logics in
mark_mounts_for_expiry() (which, in turn, used to be a
rudiment of bad old times when namespace_sem was per-ns).

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-03 14:07:59 -05:00
Al Viro
47cd813f29 Take vfsmount_lock to fs/internal.h
no more users left outside of fs/*.c (and very few outside of
fs/namespace.c, actually)

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-03 14:07:59 -05:00
Al Viro
37afdc7960 get rid of insanity with namespace roots in tomoyo
passing *any* namespace root to __d_path() as root is equivalent
to just passing it {NULL, NULL}; no need to bother with finding
the root of our namespace in there.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-03 14:07:59 -05:00
Al Viro
9f5596af44 take check for new events in namespace (guts of mounts_poll()) to namespace.c
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-03 14:07:59 -05:00
Al Viro
e21e7095a7 Don't mess with generic_permission() under ->d_lock in hpfs
Just use dentry_unhash() there

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-03 14:07:58 -05:00
Al Viro
391e8bbd38 sanitize const/signedness for udf
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-03 14:07:58 -05:00
Al Viro
072f98b463 nilfs: sanitize const/signedness in dealing with ->d_name.name
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-03 14:07:58 -05:00
Al Viro
0319003d0d nilfs really shouldn't slap struct dentry on stack...
... especially when it only needs (and initializes) .d_name of it

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-03 14:07:58 -05:00
Al Viro
89031bc797 sanitize const/signedness of ufs a bit
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-03 14:07:57 -05:00
Al Viro
7e7742ee00 sanitize signedness/const for pointers to char in hpfs a bit
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-03 14:07:57 -05:00
Al Viro
1f707137b5 new helper: iterate_mounts()
apply function to vfsmounts in set returned by collect_mounts(),
stop if it returns non-zero.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-03 14:07:57 -05:00
Al Viro
462d60577a fix NFS4 handling of mountpoint stat
RFC says we need to follow the chain of mounts if there's more
than one stacked on that point.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-03 14:07:57 -05:00
Al Viro
3088dd7080 Clean follow_dotdot() up a bit
No need to open-code follow_up() in it and locking can be lighter.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-03 14:07:56 -05:00
Al Viro
de27a5bf9c fix mnt_mountpoint abuse in smack
(mnt,mnt_mountpoint) pair is conceptually wrong; if you want
to use it for generating pathname and for nothing else *and*
if you know that vfsmount tree is unchanging, you can get
away with that, but the right solution for that is (mnt,mnt_root).

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-03 14:07:56 -05:00
Al Viro
f694869709 a couple of mntget+dget -> path_get in nfs4proc
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-03 14:07:56 -05:00
Al Viro
6eae7974d0 Switch alloc_nfs_open_context() to struct path
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-03 14:07:56 -05:00
Al Viro
2096f759ab New helper: path_is_under(path1, path2)
Analog of is_subdir for vfsmount,dentry pairs, moved from audit_tree.c

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-03 14:07:55 -05:00
Valerie Aurora
495d6c9c65 VFS: Clean up shared mount flag propagation
The handling of mount flags in set_mnt_shared() got a little tangled
up during previous cleanups, with the following problems:

* MNT_PNODE_MASK is defined as a literal constant when it should be a
bitwise xor of other MNT_* flags
* set_mnt_shared() clears and then sets MNT_SHARED (part of MNT_PNODE_MASK)
* MNT_PNODE_MASK could use a comment in mount.h
* MNT_PNODE_MASK is a terrible name, change to MNT_SHARED_MASK

This patch fixes these problems.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-03 14:07:55 -05:00
Christoph Hellwig
2ecdc82ef0 kill unused invalidate_inode_pages helper
No one is calling this anymore as everyone has switched to
invalidate_mapping_pages long time ago.  Also update a few
references to it in comments.  nfs has two more, but I can't
easily figure what they are actually referring to, so I left
them as-is.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-03 14:07:55 -05:00
Richard Kennedy
270ba5f7c5 fs: re-order super_block to remove 16 bytes of padding on 64bit builds
re-order structure super_block to remove 16 bytes of alignment padding
on 64bit builds.

This shrinks the size of super_block from 712 to 696 bytes so requiring
one fewer 64 byte cache lines.

Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>

-----
patch against 2.6.33-rc5
compiled & tested on x86_64 AMDX2 desktop machine.

I've been running with this patch applied for several weeks with no
problems.

regards
Richard
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-03 14:07:55 -05:00
Al Viro
f1771ffaac Simplify failure exits in s390/hypfs fill_super()
->kill_sb() will be called after any failure exit, so no need
to duplicate what it can do.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-03 14:07:54 -05:00
Al Viro
fc7bed8c80 Don't bother with d_genocide in rpc_pipe
kill_litter_super() from ->kill_sb() will take care of the junk
2010-03-03 14:07:54 -05:00
Al Viro
5b7e934d88 Use kill_litter_super() in autofs4 ->kill_sb()
... and get rid of open-coding its guts (i.e. RIP autofs4_force_release())

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-03 14:07:54 -05:00
Al Viro
3899167dbd Get rid of mnt_mountpoint abuses in ext4
path to mnt/mnt->mnt_root is no worse than that to
mnt->mnt_parent/mnt->mnt_mountpoint *and* needs no
pinning the sucker down (mnt is not going away and
mnt->mnt_root won't change)

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-03 14:07:54 -05:00