Commit Graph

25762 Commits (40ffe67d2e89c7a475421d007becc11a2f88ea3d)

Author SHA1 Message Date
Al Viro 9fcf03d0d6 aio: fix the comment in aio_kick_handler()
It should've been changed when queue_work() became
queue_delayed_work(..., 0) in there.  It's always had been
about not needing a delay, not about not using specific
function...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-03-20 21:29:40 -04:00
Al Viro cd1ea261ac aio: don't bother with cancel_delayed_work() in exit_aio()
__put_ioctx() will cover it anyway.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-03-20 21:29:39 -04:00
Al Viro bf50722a3c aio: use cancel_delayed_work_sync()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-03-20 21:29:39 -04:00
Al Viro 9fa1cb397f aio: aio_nr_lock is taken only synchronously now
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-03-20 21:29:39 -04:00
Al Viro 2dd542b7ae aio: aio_nr decrements don't need to be delayed
we can do that right in __put_ioctx(); as the result, the loop
in ioctx_alloc() can be killed.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-03-20 21:29:38 -04:00
Al Viro e23754f880 aio: don't bother with async freeing on failure in ioctx_alloc()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-03-20 21:29:38 -04:00
Kai Bankett 5d026c7242 fs: initial qnx6fs addition
Adds support for qnx6fs readonly support to the linux kernel.

* Mount option
  The option mmi_fs can be used to mount Harman Becker/Audi MMI 3G
  HDD qnx6fs filesystems.

* Documentation
  A high level filesystem stucture description can be found in the
  Documentation/filesystems directory. (qnx6.txt)

* Additional features
  - Active (stable) superblock selection
  - Superblock checksum check (enforced)
  - Supports mount of qnx6 filesystems with to host different endianess
  - Automatic endianess detection
  - Longfilename support (with non-enfocing crc check)
  - All blocksizes (512, 1024, 2048 and 4096 supported)

Signed-off-by: Kai Bankett <chaosman@ontika.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-03-20 21:29:38 -04:00
Kai Bankett 516cdb68e5 qnx4fs: small cleanup
Small qnx4 cleanup patch.
- removes .writepage, .write_begin and .write_end (+callback functions)
- removes '.' path checking in namei.c (handled on upper layers)

Signed-off-by: Kai Bankett <chaosman@ontika.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-03-20 21:29:37 -04:00
Al Viro 32991ab305 vfs: d_alloc_root() gone
all callers converted to d_make_root() by now

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-03-20 21:29:37 -04:00
Al Viro 318ceed088 tidy up after d_make_root() conversion
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-03-20 21:29:37 -04:00
Al Viro ca85c07809 minixfs: switch to d_make_root()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-03-20 21:29:36 -04:00
Al Viro 68acb8e60d hfsplus: switch to d_make_root()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-03-20 21:29:36 -04:00
Al Viro 1688f86046 fat: switch to d_make_root()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-03-20 21:29:36 -04:00
Al Viro ea29c6950a ntfs: switch to d_make_root()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-03-20 21:29:35 -04:00
Al Viro 48fde701af switch open-coded instances of d_make_root() to new helper
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-03-20 21:29:35 -04:00
Al Viro 6b4231e2f9 procfs: clean proc_fill_super() up
First of all, there's no need to zero ->i_uid/->i_gid on root inode -
both had been set to zero already.  Moreover, let's take the iput()
on failure to the failure exit it belongs to...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-03-20 21:29:34 -04:00
Al Viro be0d93f0aa ... and the same failure exits cleanup for ocfs2
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-03-20 21:29:34 -04:00
Al Viro f56b0fbc64 coda: clean failure exits in coda_fill_super()
same as for cifs, move iput() to the right place, make it unconditional

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-03-20 21:29:34 -04:00
Al Viro 064326c077 clean up the failure exits in cifs_read_super()
no need to make that iput() conditional, just take it to the right place...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-03-20 21:29:33 -04:00
Al Viro 9bcb4b733c vfs: turn generic_drop_inode() into static inline
Once upon a time it used to be much bigger, but these days there's
no point whatsoever keeping it in fs/inode.c, especially since
it's not even needed as initializer for ->drop_inode() - it's the
default and leaving ->drop_inode NULL will do just as well.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-03-20 21:29:33 -04:00
Al Viro e28e832c3e ecryptfs: don't bother with ->drop_inode()
generic_drop_inode() is the default

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-03-20 21:29:33 -04:00
Al Viro b57ce9694e vfs: drop_file_write_access() made static
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-03-20 21:29:32 -04:00
Al Viro 8de5277879 vfs: check i_nlink limits in vfs_{mkdir,rename_dir,link}
New field of struct super_block - ->s_max_links.  Maximal allowed
value of ->i_nlink or 0; in the latter case all checks still need
to be done in ->link/->mkdir/->rename instances.  Note that this
limit applies both to directoris and to non-directories.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-03-20 21:29:32 -04:00
Jason Baron 93dc6107a7 Don't limit non-nested epoll paths
Commit 28d82dc1c4 ("epoll: limit paths") that I did to limit the
number of possible wakeup paths in epoll is causing a few applications
to longer work (dovecot for one).

The original patch is really about limiting the amount of epoll nesting
(since epoll fds can be attached to other fds). Thus, we probably can
allow an unlimited number of paths of depth 1. My current patch limits
it at 1000. And enforce the limits on paths that have a greater depth.

This is captured in: https://bugzilla.redhat.com/show_bug.cgi?id=681578

Signed-off-by: Jason Baron <jbaron@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-18 12:25:04 -07:00
Linus Torvalds cb1ecf25a8 Merge branch 'akpm' (more patches from Andrew)
Merge some more email patches from Andrew Morton:
 "A couple of nilfs fixes"

* emailed from Andrew Morton <akpm@linux-foundation.org>:
  nilfs2: fix NULL pointer dereference in nilfs_load_super_block()
  nilfs2: clamp ns_r_segments_percentage to [1, 99]
2012-03-16 17:14:55 -07:00
Ryusuke Konishi d7178c79d9 nilfs2: fix NULL pointer dereference in nilfs_load_super_block()
According to the report from Slicky Devil, nilfs caused kernel oops at
nilfs_load_super_block function during mount after he shrank the
partition without resizing the filesystem:

 BUG: unable to handle kernel NULL pointer dereference at 00000048
 IP: [<d0d7a08e>] nilfs_load_super_block+0x17e/0x280 [nilfs2]
 *pde = 00000000
 Oops: 0000 [#1] PREEMPT SMP
 ...
 Call Trace:
  [<d0d7a87b>] init_nilfs+0x4b/0x2e0 [nilfs2]
  [<d0d6f707>] nilfs_mount+0x447/0x5b0 [nilfs2]
  [<c0226636>] mount_fs+0x36/0x180
  [<c023d961>] vfs_kern_mount+0x51/0xa0
  [<c023ddae>] do_kern_mount+0x3e/0xe0
  [<c023f189>] do_mount+0x169/0x700
  [<c023fa9b>] sys_mount+0x6b/0xa0
  [<c04abd1f>] sysenter_do_call+0x12/0x28
 Code: 53 18 8b 43 20 89 4b 18 8b 4b 24 89 53 1c 89 43 24 89 4b 20 8b 43
 20 c7 43 2c 00 00 00 00 23 75 e8 8b 50 68 89 53 28 8b 54 b3 20 <8b> 72
 48 8b 7a 4c 8b 55 08 89 b3 84 00 00 00 89 bb 88 00 00 00
 EIP: [<d0d7a08e>] nilfs_load_super_block+0x17e/0x280 [nilfs2] SS:ESP 0068:ca9bbdcc
 CR2: 0000000000000048

This turned out due to a defect in an error path which runs if the
calculated location of the secondary super block was invalid.

This patch fixes it and eliminates the reported oops.

Reported-by: Slicky Devil <slicky.dvl@gmail.com>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Tested-by: Slicky Devil <slicky.dvl@gmail.com>
Cc: <stable@vger.kernel.org>	[2.6.30+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-16 17:14:44 -07:00
Haogang Chen 3d777a6406 nilfs2: clamp ns_r_segments_percentage to [1, 99]
ns_r_segments_percentage is read from the disk.  Bogus or malicious
value could cause integer overflow and malfunction due to meaningless
disk usage calculation.  This patch reports error when mounting such
bogus volumes.

Signed-off-by: Haogang Chen <haogangchen@gmail.com>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-16 17:14:44 -07:00
Anton Blanchard c017386352 afs: Remote abort can cause BUG in rxrpc code
When writing files to afs I sometimes hit a BUG:

kernel BUG at fs/afs/rxrpc.c:179!

With a backtrace of:

	afs_free_call
	afs_make_call
	afs_fs_store_data
	afs_vnode_store_data
	afs_write_back_from_locked_page
	afs_writepages_region
	afs_writepages

The cause is:

	ASSERT(skb_queue_empty(&call->rx_queue));

Looking at a tcpdump of the session the abort happens because we
are exceeding our disk quota:

	rx abort fs reply store-data error diskquota exceeded (32)

So the abort error is valid. We hit the BUG because we haven't
freed all the resources for the call.

By freeing any skbs in call->rx_queue before calling afs_free_call
we avoid hitting leaking memory and avoid hitting the BUG.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-16 17:01:41 -07:00
Anton Blanchard 2c724fb927 afs: Read of file returns EBADMSG
A read of a large file on an afs mount failed:

# cat junk.file > /dev/null
cat: junk.file: Bad message

Looking at the trace, call->offset wrapped since it is only an
unsigned short. In afs_extract_data:

        _enter("{%u},{%zu},%d,,%zu", call->offset, len, last, count);
...

        if (call->offset < count) {
                if (last) {
                        _leave(" = -EBADMSG [%d < %zu]", call->offset, count);
                        return -EBADMSG;
                }

Which matches the trace:

[cat   ] ==> afs_extract_data({65132},{524},1,,65536)
[cat   ] <== afs_extract_data() = -EBADMSG [0 < 65536]

call->offset went from 65132 to 0. Fix this by making call->offset an
unsigned int.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-16 17:01:41 -07:00
Linus Torvalds f1cbd03f5e Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
 "Been sitting on this for a while, but lets get this out the door.
  This fixes various important bugs for 3.3 final, along with a few more
  trivial ones.  Please pull!"

* 'for-linus' of git://git.kernel.dk/linux-block:
  block: fix ioc leak in put_io_context
  block, sx8: fix pointer math issue getting fw version
  Block: use a freezable workqueue for disk-event polling
  drivers/block/DAC960: fix -Wuninitialized warning
  drivers/block/DAC960: fix DAC960_V2_IOCTL_Opcode_T -Wenum-compare warning
  block: fix __blkdev_get and add_disk race condition
  block: Fix setting bio flags in drivers (sd_dif/floppy)
  block: Fix NULL pointer dereference in sd_revalidate_disk
  block: exit_io_context() should call elevator_exit_icq_fn()
  block: simplify ioc_release_fn()
  block: replace icq->changed with icq->flags
2012-03-14 17:16:45 -07:00
Linus Torvalds 8e8bb96d24 Merge git://git.samba.org/sfrench/cifs-2.6
Pull CIFS fixes from Steve French.

* git://git.samba.org/sfrench/cifs-2.6:
  CIFS: Do not kmalloc under the flocks spinlock
  cifs: possible memory leak in xattr.
2012-03-13 17:03:53 -07:00
Al Viro 310fa7a367 restore smp_mb() in unlock_new_inode()
wait_on_inode() doesn't have ->i_lock

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-03-10 17:07:28 -05:00
Miklos Szeredi 7f6c7e62fc vfs: fix return value from do_last()
complete_walk() returns either ECHILD or ESTALE.  do_last() turns this into
ECHILD unconditionally.  If not in RCU mode, this error will reach userspace
which is complete nonsense.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
CC: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-03-10 17:05:30 -05:00
Miklos Szeredi 097b180ca0 vfs: fix double put after complete_walk()
complete_walk() already puts nd->path, no need to do it again at cleanup time.

This would result in Oopses if triggered, apparently the codepath is not too
well exercised.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
CC: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-03-10 17:05:30 -05:00
Jan Kara f6940fe909 udf: Fix deadlock in udf_release_file()
udf_release_file() can be called from munmap() path with mmap_sem held.  Thus
we cannot take i_mutex there because that ranks above mmap_sem. Luckily,
i_mutex is not needed in udf_release_file() anymore since protection by
i_data_sem is enough to protect from races with write and truncate.

Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Reviewed-by: Namjae Jeon <linkinjeon@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-03-10 16:05:38 -05:00
Tyler Hicks 978d6d8c45 vfs: Correctly set the dir i_mutex lockdep class
9a7aa12f39 introduced additional logic around setting the i_mutex
lockdep class for directory inodes. The idea was that some filesystems
may want their own special lockdep class for different directory
inodes and calling unlock_new_inode() should not clobber one of
those special classes.

I believe that the added conditional, around the *negated* return value
of lockdep_match_class(), caused directory inodes to be placed in the
wrong lockdep class.

inode_init_always() sets the i_mutex lockdep class with i_mutex_key for
all inodes. If the filesystem did not change the class during inode
initialization, then the conditional mentioned above was false and the
directory inode was incorrectly left in the non-directory lockdep class.
If the filesystem did set a special lockdep class, then the conditional
mentioned above was true and that class was clobbered with
i_mutex_dir_key.

This patch removes the negation from the conditional so that the i_mutex
lockdep class is properly set for directory inodes. Special classes are
preserved and directory inodes with unmodified classes are set with
i_mutex_dir_key.

Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-03-10 16:05:38 -05:00
Al Viro c7b2855505 aio: fix the "too late munmap()" race
Current code has put_ioctx() called asynchronously from aio_fput_routine();
that's done *after* we have killed the request that used to pin ioctx,
so there's nothing to stop io_destroy() waiting in wait_for_all_aios()
from progressing.  As the result, we can end up with async call of
put_ioctx() being the last one and possibly happening during exit_mmap()
or elf_core_dump(), neither of which expects stray munmap() being done
to them...

We do need to prevent _freeing_ ioctx until aio_fput_routine() is done
with that, but that's all we care about - neither io_destroy() nor
exit_aio() will progress past wait_for_all_aios() until aio_fput_routine()
does really_put_req(), so the ioctx teardown won't be done until then
and we don't care about the contents of ioctx past that point.

Since actual freeing of these suckers is RCU-delayed, we don't need to
bump ioctx refcount when request goes into list for async removal.
All we need is rcu_read_lock held just over the ->ctx_lock-protected
area in aio_fput_routine().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Acked-by: Benjamin LaHaise <bcrl@kvack.org>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-09 18:59:59 -08:00
Al Viro 86b62a2cb4 aio: fix io_setup/io_destroy race
Have ioctx_alloc() return an extra reference, so that caller would drop it
on success and not bother with re-grabbing it on failure exit.  The current
code is obviously broken - io_destroy() from another thread that managed
to guess the address io_setup() would've returned would free ioctx right
under us; gets especially interesting if aio_context_t * we pass to
io_setup() points to PROT_READ mapping, so put_user() fails and we end
up doing io_destroy() on kioctx another thread has just got freed...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-09 18:59:59 -08:00
Linus Torvalds 86e0600833 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs updates from Chris Mason:
 "I have two additional and btrfs fixes in my for-linus branch.  One is
  a casting error that leads to memory corruption on i386 during scrub,
  and the other fixes a corner case in the backref walking code (also
  triggered by scrub)."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
  Btrfs: fix casting error in scrub reada code
  btrfs: fix locking issues in find_parent_nodes()
2012-03-09 18:09:18 -08:00
Pavel Shilovsky d5751469f2 CIFS: Do not kmalloc under the flocks spinlock
Reorganize the code to make the memory already allocated before
spinlock'ed loop.

Cc: stable@vger.kernel.org
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2012-03-06 21:50:15 -06:00
Santosh Nayak b0f8ef202e cifs: possible memory leak in xattr.
Memory is allocated irrespective of whether CIFS_ACL is configured
or not. But free is happenning only if CIFS_ACL is set. This is a
possible memory leak scenario.

Fix is:
Allocate and free memory only if CIFS_ACL is configured.

Signed-off-by: Santosh Nayak <santoshprasadnayak@gmail.com>
Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2012-03-06 21:46:53 -06:00
Linus Torvalds 71fece9511 Merge git://git.samba.org/sfrench/cifs-2.6
Pull CIFS fixes from Steve French

* git://git.samba.org/sfrench/cifs-2.6:
  cifs: fix dentry refcount leak when opening a FIFO on lookup
  CIFS: Fix mkdir/rmdir bug for the non-POSIX case
2012-03-06 16:55:50 -08:00
Linus Torvalds 3e85fb9cd4 Merge branch 'akpm' (Andrew's patch bomb)
Merge the emailed seties of 19 patches from Andrew Morton

* akpm:
  rapidio/tsi721: fix queue wrapping bug in inbound doorbell handler
  memcg: fix mapcount check in move charge code for anonymous page
  mm: thp: fix BUG on mm->nr_ptes
  alpha: fix 32/64-bit bug in futex support
  memcg: fix GPF when cgroup removal races with last exit
  debugobjects: Fix selftest for static warnings
  floppy/scsi: fix setting of BIO flags
  memcg: fix deadlock by inverting lrucare nesting
  drivers/rtc/rtc-r9701.c: fix crash in r9701_remove()
  c2port: class_create() returns an ERR_PTR
  pps: class_create() returns an ERR_PTR, not NULL
  hung_task: fix the broken rcu_lock_break() logic
  vfork: kill PF_STARTING
  coredump_wait: don't call complete_vfork_done()
  vfork: make it killable
  vfork: introduce complete_vfork_done()
  aio: wake up waiters when freeing unused kiocbs
  kprobes: return proper error code from register_kprobe()
  kmsg_dump: don't run on non-error paths by default
2012-03-05 15:50:25 -08:00
Oleg Nesterov 57b59c4a14 coredump_wait: don't call complete_vfork_done()
Now that CLONE_VFORK is killable, coredump_wait() no longer needs
complete_vfork_done().  zap_threads() should find and kill all tasks with
the same ->mm, this includes our parent if ->vfork_done is set.

mm_release() becomes the only caller, unexport complete_vfork_done().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-05 15:49:42 -08:00
Oleg Nesterov c415c3b47e vfork: introduce complete_vfork_done()
No functional changes.

Move the clear-and-complete-vfork_done code into the new trivial helper,
complete_vfork_done().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-05 15:49:42 -08:00
Jeff Moyer 880641bb9d aio: wake up waiters when freeing unused kiocbs
Bart Van Assche reported a hung fio process when either hot-removing
storage or when interrupting the fio process itself.  The (pruned) call
trace for the latter looks like so:

  fio             D 0000000000000001     0  6849   6848 0x00000004
   ffff880092541b88 0000000000000046 ffff880000000000 ffff88012fa11dc0
   ffff88012404be70 ffff880092541fd8 ffff880092541fd8 ffff880092541fd8
   ffff880128b894d0 ffff88012404be70 ffff880092541b88 000000018106f24d
  Call Trace:
    schedule+0x3f/0x60
    io_schedule+0x8f/0xd0
    wait_for_all_aios+0xc0/0x100
    exit_aio+0x55/0xc0
    mmput+0x2d/0x110
    exit_mm+0x10d/0x130
    do_exit+0x671/0x860
    do_group_exit+0x44/0xb0
    get_signal_to_deliver+0x218/0x5a0
    do_signal+0x65/0x700
    do_notify_resume+0x65/0x80
    int_signal+0x12/0x17

The problem lies with the allocation batching code.  It will
opportunistically allocate kiocbs, and then trim back the list of iocbs
when there is not enough room in the completion ring to hold all of the
events.

In the case above, what happens is that the pruning back of events ends
up freeing up the last active request and the context is marked as dead,
so it is thus responsible for waking up waiters.  Unfortunately, the
code does not check for this condition, so we end up with a hung task.

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Reported-by: Bart Van Assche <bvanassche@acm.org>
Tested-by: Bart Van Assche <bvanassche@acm.org>
Cc: <stable@kernel.org>		[3.2.x only]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-05 15:49:42 -08:00
Al Viro 6414fa6a15 aout: move setup_arg_pages() prior to reading/mapping the binary
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-05 13:51:32 -08:00
Linus Torvalds 5483f18e98 vfs: move dentry_cmp from <linux/dcache.h> to fs/dcache.c
It's only used inside fs/dcache.c, and we're going to play games with it
for the word-at-a-time patches.  This time we really don't even want to
export it, because it really is an internal function to fs/dcache.c, and
has been since it was introduced.

Having it in that extremely hot header file (it's included in pretty
much everything, thanks to <linux/fs.h>) is a disaster for testing
different versions, and is utterly pointless.

We really should have some kind of header file diet thing, where we
figure out which parts of header files are really better off private and
only result in more expensive compiles.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-04 15:51:42 -08:00
Chris Mason a175423c83 Btrfs: fix casting error in scrub reada code
The reada code from scrub was casting down a u64 to
an unsigned long so it could insert it into a radix tree.

What it really wanted to do was cast down the result of a shift, instead
of casting down the u64.  The bug resulted in trying to insert our
reada struct into the wrong place, which caused soft lockups and other
problems.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
2012-03-03 07:42:35 -05:00
Li Zefan d3b010640e btrfs: fix locking issues in find_parent_nodes()
- We might unlock head->mutex while it was not locked
- We might leave the function without unlocking delayed_refs->lock

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2012-03-03 07:41:15 -05:00