Commit Graph

28382 Commits (df031f0752d50f2061df2847d57ea52a79f7977c)

Author SHA1 Message Date
Alexander Block 34d73f54e2 Btrfs: make aux field of ulist 64 bit
Btrfs send/receive uses the aux field to store inode numbers. On
32 bit machines this may become a problem.

Also fix all users of ulist_add and ulist_add_merged.

Reported-by: Arne Jansen <sensille@gmx.net>
Signed-off-by: Alexander Block <ablock84@googlemail.com>
2012-10-01 15:18:53 -04:00
Alexander Block 7e0926fe5f Btrfs: fix use of radix_tree for name_cache in send/receive
We can't easily use the index of the radix tree for inums as the
radix tree uses 32bit indexes on 32bit kernels. For 32bit kernels,
we now use the lower 32bit of the inum as index and an additional
list to store multiple entries per radix tree entry.

Reported-by: Arne Jansen <sensille@gmx.net>
Signed-off-by: Alexander Block <ablock84@googlemail.com>
2012-10-01 15:18:52 -04:00
Alexander Block 17589bd96e Btrfs: fix memory leak for name_cache in send/receive
When everything is done, name_cache_free is called which however
forgot to call kfree on the cache entries.

Signed-off-by: Alexander Block <ablock84@googlemail.com>
2012-10-01 15:18:51 -04:00
Alexander Block adbe7fb6c4 Btrfs: don't break in the final loop of find_extent_clone
If we break, we may miss the clone from send_root which we prefer
over all other clones.

Commit is a result of Arne's review.

Reported-by: Arne Jansen <sensille@gmx.net>
Signed-off-by: Alexander Block <ablock84@googlemail.com>
2012-10-01 15:18:50 -04:00
Alexander Block 52f9e53ede Btrfs: use normal return path for root == send_root case
Don't have a seperate return path for the mentioned case. Now
we do the same "take lowest inode/offset" logic for all found clones.

Commit is a result of Arne's review.

Signed-off-by: Alexander Block <ablock84@googlemail.com>
2012-10-01 15:18:50 -04:00
Alexander Block 35075bb046 Btrfs: use kmalloc instead of stack for backref_ctx
Make sure to never get in trouble due to the backref_ctx
which was on the stack before.

Commit is a result of Arne's review.

Signed-off-by: Alexander Block <ablock84@googlemail.com>
2012-10-01 15:18:49 -04:00
Alexander Block ee849c0472 Btrfs: rename backref_ctx::found_in_send_root to found_itself
The new name should be easier to understand/read.

Commit is a result of Arne's review.

Signed-off-by: Alexander Block <ablock84@googlemail.com>
2012-10-01 15:18:48 -04:00
Alexander Block d27aed5e24 Btrfs: remove unused use_list from send/receive code
use_list is a leftover and unused.

Signed-off-by: Alexander Block <ablock84@googlemail.com>
2012-10-01 15:18:48 -04:00
Alexander Block ccf1626b49 Btrfs: add correct parent to check_dirs when dir got moved
We only added the parent for the new position of a moved dir.
We also need to add the old parent of the moved dir.

Reported-by: Alex Lyakas <alex.bolshoy.btrfs@gmail.com>
Signed-off-by: Alexander Block <ablock84@googlemail.com>
2012-10-01 15:18:47 -04:00
Alexander Block 9ea3ef516d Btrfs: remove unused code with #if 0
fs_path_remove is not used at the moment due to a previous patch.
Remove it for now (with #if 0) to avoid compile warnings.

Signed-off-by: Alexander Block <ablock84@googlemail.com>
2012-10-01 15:18:46 -04:00
Alexander Block b9291affaa Btrfs: add missing check for dir != tmp_dir to is_first_ref
We missed that check which resultet in all refs with the same name
being reported as first_ref.

Reported-by: Alex Lyakas <alex.bolshoy.btrfs@gmail.com>
Signed-off-by: Alexander Block <ablock84@googlemail.com>
2012-10-01 15:18:45 -04:00
Alexander Block 1f4692da95 Btrfs: fix cur_ino < parent_ino case for send/receive
When the current inodes inum is smaller then the inum of the
parent directory strange things were happending due to wrong
path resolution and other bugs. Fix this with a new approach
for the problem.

Reported-by: Alex Lyakas <alex.bolshoy.btrfs@gmail.com>
Signed-off-by: Alexander Block <ablock84@googlemail.com>
2012-10-01 15:18:45 -04:00
Alexander Block 85a7b33b96 Btrfs: add rdev to get_inode_info in send/receive
We need rdev in the next commit.

Signed-off-by: Alexander Block <ablock84@googlemail.com>
2012-10-01 15:18:44 -04:00
Miklos Szeredi 8110e16d42 vfs: dcache: fix deadlock in tree traversal
IBM reported a deadlock in select_parent().  This was found to be caused
by taking rename_lock when already locked when restarting the tree
traversal.

There are two cases when the traversal needs to be restarted:

 1) concurrent d_move(); this can only happen when not already locked,
    since taking rename_lock protects against concurrent d_move().

 2) racing with final d_put() on child just at the moment of ascending
    to parent; rename_lock doesn't protect against this rare race, so it
    can happen when already locked.

Because of case 2, we need to be able to handle restarting the traversal
when rename_lock is already held.  This patch fixes all three callers of
try_to_ascend().

IBM reported that the deadlock is gone with this patch.

[ I rewrote the patch to be smaller and just do the "goto again" if the
  lock was already held, but credit goes to Miklos for the real work.
   - Linus ]

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-09-29 17:41:40 -07:00
Linus Torvalds 7596824e66 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs fixes from Al Viro:
 "A couple of fixes; one for automount/lazy umount race, another a
  classic "we don't protect the refcount transition to zero with the
  lock that protects looking for object in hash" kind of crap in lockd."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  close the race in nlmsvc_free_block()
  do_add_mount()/umount -l races
2012-09-28 10:02:53 -07:00
J. Bruce Fields fd51790949 trivial select_parent documentation fix
"Search list for X" sounds like you're trying to find X on a list.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-09-27 15:43:08 -07:00
Al Viro c5aa1e554a close the race in nlmsvc_free_block()
we need to grab mutex before the reference counter reaches 0

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-09-22 20:48:20 -04:00
Al Viro 156cacb1d0 do_add_mount()/umount -l races
normally we deal with lock_mount()/umount races by checking that
mountpoint to be is still in our namespace after lock_mount() has
been done.  However, do_add_mount() skips that check when called
with MNT_SHRINKABLE in flags (i.e. from finish_automount()).  The
reason is that ->mnt_ns may be a temporary namespace created exactly
to contain automounts a-la NFS4 referral handling.  It's not the
namespace of the caller, though, so check_mnt() would fail here.
We still need to check that ->mnt_ns is non-NULL in that case,
though.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-09-22 20:48:18 -04:00
Linus Torvalds a4be6c77b5 Merge branch 'for-linus' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fix from Steve French.

* 'for-linus' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: fix return value in cifsConvertToUTF16
2012-09-22 12:36:57 -07:00
Linus Torvalds 789f95b788 xfs: bugfixes for 3.6-rc7
- fix a regression related to xfs_sync_worker racing with unmount.
 - fix a race while discarding xfs buffers.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.10 (GNU/Linux)
 
 iQIcBAABAgAGBQJQWO3uAAoJENaLyazVq6ZOjfcP+gJkcJLS5+qmyNEcW2IUH0+E
 4WptMdBCLgZGa54aGAJ2mwg0FysyyTiTXjOSETRiBU+N3bAhgweucRsxc8z+awen
 L+InHr8YgQyAoY0nhEcXI/EuHaF9OlgVT6YCOqr/V4gtLO+aczovQS1wA3w/pjAk
 RWa4z+VlH+D9KenatoCcHSY6PIPO9pLs4Gfb7D/9BLFN+f6OnIaUlkwIQSuumuaw
 Lt/sw24/FEBYyzspmGfJT1fjDZK4VI4QoPEAVuvGiJCGFzSW2RDmlb48ZXsnGBbM
 f83tKjB7praQhXnBt56/S5YThgWzt8eaJVIhSExtEh1tisb5iWNQzVPk+USXUE9t
 DNTxtJjwiECbslyVYkTDUKnhdPGtHkpQSN96RBUDvQYfoLHQ/aXbxfPIZGEt24YM
 A/TbCFDFQrI91Rn3TkAxygvfOkxWxE9TB1PmwfgrJGFDWNxg84OBiCX9IMNi3NUF
 glqoKn6aI5fZH6gHVU7xA+bnfJYYRIxUtgIHJ1sYH6dH185G5Yj3m9bojcN7DnmM
 x1kLf0lscumgdB3OGLgpe5IrrFKM+ncclkS24X3eWOCvnWiEXBwajPqA8LloekZA
 X+IyGhoSfg2yRJAYEipRD+H0XouNM/AsLMcI/VbEoLGebxpsKCkg0VwCbd/4xISO
 90Q9jWXC4dzUVRc60rPw
 =ZcGP
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-v3.6-rc7' of git://oss.sgi.com/xfs/xfs

Pull xfs bugfixes from Ben Myers:
 - fix a regression related to xfs_sync_worker racing with unmount.
 - fix a race while discarding xfs buffers.

* tag 'for-linus-v3.6-rc7' of git://oss.sgi.com/xfs/xfs:
  xfs: stop the sync worker before xfs_unmountfs
  xfs: fix race while discarding buffers [V4]
2012-09-21 12:43:01 -07:00
Linus Torvalds e05e279e6f debugfs: fix u32_array race in format_array_alloc
The format_array_alloc() function is fundamentally racy, in that it
prints the array twice: once to figure out how much space to allocate
for the buffer, and the second time to actually print out the data.

If any of the array contents changes in between, the allocation size may
be wrong, and the end result may be truncated in odd ways.

Just don't do it.  Allocate a maximum-sized array up-front, and just
format the array contents once.  The only user of the u32_array
interfaces is the Xen spinlock statistics code, and it has 31 entries in
the arrays, so the maximum size really isn't that big, and the end
result is much simpler code without the bug.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-09-21 11:48:05 -07:00
David Rientjes 36048853c5 debugfs: fix race in u32_array_read and allocate array at open
u32_array_open() is racy when multiple threads read from a file with a
seek position of zero, i.e. when two or more simultaneous reads are
occurring after the non-seekable files are created.  It is possible that
file->private_data is double-freed because the threads races between

	kfree(file->private-data);

and

	file->private_data = NULL;

The fix is to only do format_array_alloc() when the file is opened and
free it when it is closed.

Note that because the file has always been non-seekable, you can't open
it and read it multiple times anyway, so the data has always been
generated just once.  The difference is that now it is generated at open
time rather than at the time of the first read, and that avoids the
race.

Reported-by: Dave Jones <davej@redhat.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Tested-by: Raghavendra <raghavendra.kt@linux.vnet.ibm.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-09-21 10:28:17 -07:00
Ben Myers 0ba6e5368c xfs: stop the sync worker before xfs_unmountfs
Cancel work of the xfs_sync_worker before teardown of the log in
xfs_unmountfs.  This prevents occasional crashes on unmount like so:

PID: 21602  TASK: ee9df060  CPU: 0   COMMAND: "kworker/0:3"
 #0 [c5377d28] crash_kexec at c0292c94
 #1 [c5377d80] oops_end at c07090c2
 #2 [c5377d98] no_context at c06f614e
 #3 [c5377dbc] __bad_area_nosemaphore at c06f6281
 #4 [c5377df4] bad_area_nosemaphore at c06f629b
 #5 [c5377e00] do_page_fault at c070b0cb
 #6 [c5377e7c] error_code (via page_fault) at c070892c
    EAX: f300c6a8  EBX: f300c6a8  ECX: 000000c0  EDX: 000000c0  EBP: c5377ed0
    DS:  007b      ESI: 00000000  ES:  007b      EDI: 00000001  GS:  ffffad20
    CS:  0060      EIP: c0481ad0  ERR: ffffffff  EFLAGS: 00010246
 #7 [c5377eb0] atomic64_read_cx8 at c0481ad0
 #8 [c5377ebc] xlog_assign_tail_lsn_locked at f7cc7c6e [xfs]
 #9 [c5377ed4] xfs_trans_ail_delete_bulk at f7ccd520 [xfs]
#10 [c5377f0c] xfs_buf_iodone at f7ccb602 [xfs]
#11 [c5377f24] xfs_buf_do_callbacks at f7cca524 [xfs]
#12 [c5377f30] xfs_buf_iodone_callbacks at f7cca5da [xfs]
#13 [c5377f4c] xfs_buf_iodone_work at f7c718d0 [xfs]
#14 [c5377f58] process_one_work at c024ee4c
#15 [c5377f98] worker_thread at c024f43d
#16 [c5377fbc] kthread at c025326b
#17 [c5377fe8] kernel_thread_helper at c070e834

PID: 26653  TASK: e79143b0  CPU: 3   COMMAND: "umount"
 #0 [cde0fda0] __schedule at c0706595
 #1 [cde0fe28] schedule at c0706b89
 #2 [cde0fe30] schedule_timeout at c0705600
 #3 [cde0fe94] __down_common at c0706098
 #4 [cde0fec8] __down at c0706122
 #5 [cde0fed0] down at c025936f
 #6 [cde0fee0] xfs_buf_lock at f7c7131d [xfs]
 #7 [cde0ff00] xfs_freesb at f7cc2236 [xfs]
 #8 [cde0ff10] xfs_fs_put_super at f7c80f21 [xfs]
 #9 [cde0ff1c] generic_shutdown_super at c0333d7a
#10 [cde0ff38] kill_block_super at c0333e0f
#11 [cde0ff48] deactivate_locked_super at c0334218
#12 [cde0ff58] deactivate_super at c033495d
#13 [cde0ff68] mntput_no_expire at c034bc13
#14 [cde0ff7c] sys_umount at c034cc69
#15 [cde0ffa0] sys_oldumount at c034ccd4
#16 [cde0ffb0] system_call at c0707e66

commit 11159a05 added this to xfs_log_unmount and needs to be cleaned up
at a later date.

Signed-off-by: Ben Myers <bpm@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
2012-09-18 16:51:26 -05:00
Jeff Layton c73f693989 cifs: fix return value in cifsConvertToUTF16
This function returns the wrong value, which causes the callers to get
the length of the resulting pathname wrong when it contains non-ASCII
characters.

This seems to fix https://bugzilla.samba.org/show_bug.cgi?id=6767

Cc: <stable@vger.kernel.org>
Reported-by: Baldvin Kovacs <baldvin.kovacs@gmail.com>
Reported-and-Tested-by: Nicolas Lefebvre <nico.lefebvre@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
2012-09-18 15:35:25 -05:00
Miklos Szeredi b161dfa693 vfs: dcache: use DCACHE_DENTRY_KILLED instead of DCACHE_DISCONNECTED in d_kill()
IBM reported a soft lockup after applying the fix for the rename_lock
deadlock.  Commit c83ce989cb ("VFS: Fix the nfs sillyrename regression
in kernel 2.6.38") was found to be the culprit.

The nfs sillyrename fix used DCACHE_DISCONNECTED to indicate that the
dentry was killed.  This flag can be set on non-killed dentries too,
which results in infinite retries when trying to traverse the dentry
tree.

This patch introduces a separate flag: DCACHE_DENTRY_KILLED, which is
only set in d_kill() and makes try_to_ascend() test only this flag.

IBM reported successful test results with this patch.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-09-18 11:23:51 -07:00
Francesco Ruggeri 6bf6104573 fs/proc: fix potential unregister_sysctl_table hang
The unregister_sysctl_table() function hangs if all references to its
ctl_table_header structure are not dropped.

This can happen sometimes because of a leak in proc_sys_lookup():
proc_sys_lookup() gets a reference to the table via lookup_entry(), but
it does not release it when a subsequent call to sysctl_follow_link()
fails.

This patch fixes this leak by making sure the reference is always
dropped on return.

See also commit 076c3eed2c ("sysctl: Rewrite proc_sys_lookup
introducing find_entry and lookup_entry") which reorganized this code in
3.4.

Tested in Linux 3.4.4.

Signed-off-by: Francesco Ruggeri <fruggeri@aristanetworks.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-09-17 10:32:03 -07:00
Linus Torvalds 6167f81fd1 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull a btrfs revert from Chris Mason:
 "My for-linus branch has one revert in the new quota code.

  We're building up more fixes at etc for the next merge window, but I'm
  keeping them out unless they are bigger regressions or have a huge
  impact."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
  Revert "Btrfs: fix some error codes in btrfs_qgroup_inherit()"
2012-09-16 12:58:44 -07:00
Linus Torvalds 3f0c3c8fe3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-fixes
Pull GFS2 fixes from Steven Whitehouse:
 "Here are three GFS2 fixes for the current kernel tree.  These are all
  related to the block reservation code which was added at the merge
  window.  That code will be getting an update at the forthcoming merge
  window too.  In the mean time though there are a few smaller issues
  which should be fixed.

  The first patch resolves an issue with write sizes of greater than 32
  bits with the size hinting code.  The second ensures that the
  allocation data structure is initialised when using xattrs and the
  third takes into account allocations which may have been made by other
  nodes which affect a reservation on the local node."

* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-fixes:
  GFS2: Take account of blockages when using reserved blocks
  GFS2: Fix missing allocation data for set/remove xattr
  GFS2: Make write size hinting code common
2012-09-14 18:05:14 -07:00
Linus Torvalds 1547cb80db - Fixes a regression, introduced in 3.6-rc1, when a file is closed before its
shared memory mapping is dirtied and unmapped. The lower file was being
   released when the eCryptfs file was closed and the dirtied pages could not be
   written out.
 - Adds a call to the lower filesystem's ->flush() from ecryptfs_flush().
 - Fixes a regression, introduced in 2.6.39, when a file is renamed on top of
   another file. The target file's inode was not being evicted and the space
   taken by the file was not reclaimed until eCryptfs was unmounted.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABCgAGBQJQU14UAAoJENaSAD2qAscKrpwQAK+Xu3RAHPLO16bEjhrFZaIr
 FY2z8yK5geFzHfmhDxUT7jovw8d2ghAjrE3XHf+PgrqfUqIZ6vo57GFmR8ZNS+30
 exNOb1DffaRqIejStrFkz+Ek4/FbnaPD/FvJYgC8D3Bv04YllyNI9Euq49VkkOiy
 v6jxAy4H+/MCGzYvR4K0jmKlOLuC47TtJ322jXTm0iNeJJH2+kNXm0WvE8dY7dYV
 2QMvv11M1HMI39S4yIY0eBPwbloP3AoO++H3Id7/XWcIlaaeZrVA7HmV8z04DEWq
 Nt2vceSbX4FVBxmLab5TzTftorskJRDLCtMaSM8opve+3Sxgx+wooLskgrHqJjZH
 8AtGfs/ZzvUkKcYMNv+iMax0/KqkGKOhdJMmqc37aI8qkBo5IYXpIPU1ZTy/S2T8
 t3fM8jEI4AOKFn7GlG0kZg7EXo0wWPw3A8X73tCQx8W6sJy5STvv07hFZspX/4nn
 +qhlM7Qzpr1XXPEv9cKo4oXzqOmlSi1Jd2ueGANNGJvche90uCWgmSZCfNNxGnPq
 mLiiiV6KJ94xjjyf/ZWiKLryk5rv0fWiYXzutaORADn6WWwIVKX9IyvJfFI8y+M6
 phREWNdOnyDFFA++h5QotWh3L7ZCL/HNeTpLDSD1ONxosVDhViviA2HrRlyl8LSc
 soDzN7w+yOh+QjlN4sO/
 =UC2V
 -----END PGP SIGNATURE-----

Merge tag 'ecryptfs-3.6-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs

Pull ecryptfs fixes from Tyler Hicks:

 - Fixes a regression, introduced in 3.6-rc1, when a file is closed
   before its shared memory mapping is dirtied and unmapped.  The lower
   file was being released when the eCryptfs file was closed and the
   dirtied pages could not be written out.
 - Adds a call to the lower filesystem's ->flush() from
   ecryptfs_flush().
 - Fixes a regression, introduced in 2.6.39, when a file is renamed on
   top of another file.  The target file's inode was not being evicted
   and the space taken by the file was not reclaimed until eCryptfs was
   unmounted.

* tag 'ecryptfs-3.6-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs:
  eCryptfs: Copy up attributes of the lower target inode after rename
  eCryptfs: Call lower ->flush() from ecryptfs_flush()
  eCryptfs: Write out all dirty pages just before releasing the lower file
2012-09-14 17:53:55 -07:00
Chris Mason f3a87f1b0c Revert "Btrfs: fix some error codes in btrfs_qgroup_inherit()"
This reverts commit 5986802c2f.

Both paths are not error paths but regular cases where non-qgroup
subvols are involved.

Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-09-14 20:06:30 -04:00
Linus Torvalds 55815f7014 vfs: make O_PATH file descriptors usable for 'fstat()'
We already use them for openat() and friends, but fstat() also wants to
be able to use O_PATH file descriptors.  This should make it more
directly comparable to the O_SEARCH of Solaris.

Note that you could already do the same thing with "fstatat()" and an
empty path, but just doing "fstat()" directly is simpler and faster, so
there is no reason not to just allow it directly.

See also commit 332a2e1244, which did the same thing for fchdir, for
the same reasons.

Reported-by: ольга крыжановская <olga.kryzhanovska@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: stable@kernel.org    # O_PATH introduced in 3.0+
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-09-14 14:48:21 -07:00
Tyler Hicks 8335eafc28 eCryptfs: Copy up attributes of the lower target inode after rename
After calling into the lower filesystem to do a rename, the lower target
inode's attributes were not copied up to the eCryptfs target inode. This
resulted in the eCryptfs target inode staying around, rather than being
evicted, because i_nlink was not updated for the eCryptfs inode. This
also meant that eCryptfs didn't do the final iput() on the lower target
inode so it stayed around, as well. This would result in a failure to
free up space occupied by the target file in the rename() operation.
Both target inodes would eventually be evicted when the eCryptfs
filesystem was unmounted.

This patch calls fsstack_copy_attr_all() after the lower filesystem
does its ->rename() so that important inode attributes, such as i_nlink,
are updated at the eCryptfs layer. ecryptfs_evict_inode() is now called
and eCryptfs can drop its final reference on the lower inode.

http://launchpad.net/bugs/561129

Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Tested-by: Colin Ian King <colin.king@canonical.com>
Cc: <stable@vger.kernel.org> [2.6.39+]
2012-09-14 09:36:03 -07:00
Tyler Hicks 64e6651dcc eCryptfs: Call lower ->flush() from ecryptfs_flush()
Since eCryptfs only calls fput() on the lower file in
ecryptfs_release(), eCryptfs should call the lower filesystem's
->flush() from ecryptfs_flush().

If the lower filesystem implements ->flush(), then eCryptfs should try
to flush out any dirty pages prior to calling the lower ->flush(). If
the lower filesystem does not implement ->flush(), then eCryptfs has no
need to do anything in ecryptfs_flush() since dirty pages are now
written out to the lower filesystem in ecryptfs_release().

Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
2012-09-14 09:35:54 -07:00
Tyler Hicks 7149f2558d eCryptfs: Write out all dirty pages just before releasing the lower file
Fixes a regression caused by:

821f749 eCryptfs: Revert to a writethrough cache model

That patch reverted some code (specifically, 32001d6f) that was
necessary to properly handle open() -> mmap() -> close() -> dirty pages
-> munmap(), because the lower file could be closed before the dirty
pages are written out.

Rather than reapplying 32001d6f, this approach is a better way of
ensuring that the lower file is still open in order to handle writing
out the dirty pages. It is called from ecryptfs_release(), while we have
a lock on the lower file pointer, just before the lower file gets the
final fput() and we overwrite the pointer.

https://launchpad.net/bugs/1047261

Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Reported-by: Artemy Tregubenko <me@arty.name>
Tested-by: Artemy Tregubenko <me@arty.name>
Tested-by: Colin Ian King <colin.king@canonical.com>
2012-09-14 09:11:29 -07:00
Steven Whitehouse 62e252eeef GFS2: Take account of blockages when using reserved blocks
The claim_reserved_blks() function was not taking account of
the possibility of "blockages" while performing allocation.
This can be caused by another node allocating something in
the same extent which has been reserved locally.

This patch tests for this condition and then skips the remainder
of the reservation in this case. This is a relatively rare event,
so that it should not affect the general performance improvement
which the block reservations provide.

The claim_reserved_blks() function also appears not to be able
to deal with reservations which cross bitmap boundaries, but
that can be dealt with in a future patch since we don't generate
boundary crossing reservations currently.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Reported-by: David Teigland <teigland@redhat.com>
Cc: Bob Peterson <rpeterso@redhat.com>
2012-09-13 10:30:58 +01:00
Steven Whitehouse 645b2ccc75 GFS2: Fix missing allocation data for set/remove xattr
These entry points were missed in the original patch to allocate
this data structure.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2012-09-13 10:30:34 +01:00
Steven Whitehouse da1dfb6af8 GFS2: Make write size hinting code common
This collects up the write size hinting code which is used by the
block reservation subsystem into a single function. At the same
time this also corrects the rounding for this calculation.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2012-09-13 10:30:00 +01:00
Linus Torvalds 22b4e63ebe NFS client bugfixes for Linux 3.6
- Final (hopefully) fix for the range checking code in NFSv4 getacl. This
   should fix the Oopses being seen when the acl size is close to PAGE_SIZE.
 - Fix a regression with the legacy binary mount code
 - Fix a regression in the readdir cookieverf initialisation
 - Fix an RPC over UDP regression
 - Ensure that we report all errors in the NFSv4 open code
 - Ensure that fsync() reports all relevant synchronisation errors.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJQUN+KAAoJEGcL54qWCgDyHGcQAKj7MYVDIjhdmsVGGNWXUCnf
 X0LVg/ajh+vjusK+hmquzcJikZqgce5IU5DW4vcFr1X8BgP+R51UVvU0KksByD5H
 ourV2JVCztAQzQ4WWOsZAGqN0tooJUjyEjl4lEiDsQCF4Nk1HWbuCHeYuX74OToZ
 jrgedj0EZ6zb7TOizvbgU/7lI+FKu3Hlw6+u27M9phtSuefJdYSHZHYVMOX81qPh
 k0zgZ4tuLIaDuBB84iCrPwNt9icnevq6cIc+AGluI6xhDw+foPvUaUR+OUI420IZ
 tunNzP2So+nNoyjEiyMVENaCdEyA75XAmmGHTUUdBiVOsMV4HF/TqvTtSsjk2mN1
 FbZVvtjD6srjsQaKdVmqMIZBdhY9LSMLIQVqb4H2rYP6Mwq06WTuyCxf5YhzFfoy
 2tai7JuqBkTAWfKB8ESWywV6Qk/MkUWRAOBO6ksS66gAwpcFDj6nfeAdwaEmoYKc
 uzLUIRZaclPMZf661cs1fWeFV5XOnCL7je4owgTRGs7MHooWHPcC3273fEJqnhFz
 5MkC7nfmUiGcdO1v0mfYTEtMj9Pp9icBoZcVTGn4eZIHzvhhZOx//8LhyBfS+jll
 bKjaLZ1rErvIqwnSGcB7PK2yBYY9P6ZaxWjOrAAncZmiOxfhN0hvCo54jNOr/VZ+
 atsDEAuqSTeK7ouBqyO4
 =e5yE
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-3.6-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfixes from Trond Myklebust:

 - Final (hopefully) fix for the range checking code in NFSv4 getacl.
   This should fix the Oopses being seen when the acl size is close to
   PAGE_SIZE.
 - Fix a regression with the legacy binary mount code
 - Fix a regression in the readdir cookieverf initialisation
 - Fix an RPC over UDP regression
 - Ensure that we report all errors in the NFSv4 open code
 - Ensure that fsync() reports all relevant synchronisation errors.

* tag 'nfs-for-3.6-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  NFS: fsync() must exit with an error if page writeback failed
  SUNRPC: Fix a UDP transport regression
  NFS: return error from decode_getfh in decode open
  NFSv4: Fix buffer overflow checking in __nfs4_get_acl_uncached
  NFSv4: Fix range checking in __nfs4_get_acl_uncached and __nfs4_proc_set_acl
  NFS: Fix a problem with the legacy binary mount code
  NFS: Fix the initialisation of the readdir 'cookieverf' array
2012-09-13 09:04:13 +08:00
Trond Myklebust 7b281ee026 NFS: fsync() must exit with an error if page writeback failed
We need to ensure that if the call to filemap_write_and_wait_range()
fails, then we report that error back to the application.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-11 15:38:32 -04:00
Linus Torvalds 44346cfe4d Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
Pull FUSE fixes from Miklos Szeredi:
 "This contains bugfixes for FUSE and CUSE and a compile warning fix."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  fuse: fix retrieve length
  fuse: mark variables uninitialized
  cuse: kill connection on initialization error
  cuse: fix fuse_conn_kill()
2012-09-11 09:29:17 +08:00
Linus Torvalds e9bd8f1624 Merge branch 'for-linus' of git://git.samba.org/sfrench/cifs-2.6
Pull CIFS fixes from Steve French.

* 'for-linus' of git://git.samba.org/sfrench/cifs-2.6:
  CIFS: Fix endianness conversion
  CIFS: Fix error handling in cifs_push_mandatory_locks
2012-09-11 05:52:49 +08:00
Linus Torvalds 1b0a9069de Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull UDF and ext3 fixes from Jan Kara:
 "One UDF data corruption fix and one ext3 fix where we didn't write
  everything to disk on fsync in one corner case."

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  udf: Fix data corruption for files in ICB
  ext3: Fix fdatasync() for files with only i_size changes
2012-09-11 05:51:35 +08:00
Weston Andros Adamson 01913b49cf NFS: return error from decode_getfh in decode open
If decode_getfh failed, nfs4_xdr_dec_open would return 0 since the last
decode_* call must have succeeded.

Cc: stable@vger.kernel.org
Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-06 16:01:33 -04:00
Pavel Shilovsky b2ede58e98 CIFS: Fix endianness conversion
Signed-off-by: Pavel Shilovsky <pshilovsky@etersoft.ru>
Signed-off-by: Steve French <smfrench@gmail.com>
2012-09-06 12:42:35 -05:00
Pavel Shilovsky e2f2886a82 CIFS: Fix error handling in cifs_push_mandatory_locks
Cc: <stable@vger.kernel.org>
Signed-off-by: Pavel Shilovsky <pshilovsky@etersoft.ru>
Signed-off-by: Steve French <smfrench@gmail.com>
2012-09-06 12:42:31 -05:00
Trond Myklebust 1f1ea6c2d9 NFSv4: Fix buffer overflow checking in __nfs4_get_acl_uncached
Pass the checks made by decode_getacl back to __nfs4_get_acl_uncached
so that it knows if the acl has been truncated.

The current overflow checking is broken, resulting in Oopses on
user-triggered nfs4_getfacl calls, and is opaque to the point
where several attempts at fixing it have failed.
This patch tries to clean up the code in addition to fixing the
Oopses by ensuring that the overflow checks are performed in
a single place (decode_getacl). If the overflow check failed,
we will still be able to report the acl length, but at least
we will no longer attempt to cache the acl or copy the
truncated contents to user space.

Reported-by: Sachin Prabhu <sprabhu@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Tested-by: Sachin Prabhu <sprabhu@redhat.com>
2012-09-06 11:11:53 -04:00
Jan Kara 9c2fc0de1a udf: Fix data corruption for files in ICB
When a file is stored in ICB (inode), we overwrite part of the file, and
the page containing file's data is not in page cache, we end up corrupting
file's data by overwriting them with zeros. The problem is we use
simple_write_begin() which simply zeroes parts of the page which are not
written to. The problem has been introduced by be021ee4 (udf: convert to
new aops).

Fix the problem by providing a ->write_begin function which makes the page
properly uptodate.

CC: <stable@vger.kernel.org> # >= 2.6.24
Reported-by: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Jan Kara <jack@suse.cz>
2012-09-05 16:06:03 +02:00
Trond Myklebust 21f498c2f7 NFSv4: Fix range checking in __nfs4_get_acl_uncached and __nfs4_proc_set_acl
Ensure that the user supplied buffer size doesn't cause us to overflow
the 'pages' array.

Also fix up some confusion between the use of PAGE_SIZE and
PAGE_CACHE_SIZE when calculating buffer sizes. We're not using
the page cache for anything here.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-04 14:52:43 -04:00
Trond Myklebust 872ece86ea NFS: Fix a problem with the legacy binary mount code
Apparently, am-utils is still using the legacy binary mountdata interface,
and is having trouble parsing /proc/mounts due to the 'port=' field being
incorrectly set.

The following patch should fix up the regression.

Reported-by: Marius Tolzmann <tolzmann@molgen.mpg.de>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
2012-09-04 14:52:43 -04:00
Trond Myklebust c3f52af3e0 NFS: Fix the initialisation of the readdir 'cookieverf' array
When the NFS_COOKIEVERF helper macro was converted into a static
inline function in commit 99fadcd764 (nfs: convert NFS_*(inode)
helpers to static inline), we broke the initialisation of the
readdir cookies, since that depended on doing a memset with an
argument of 'sizeof(NFS_COOKIEVERF(inode))' which therefore
changed from sizeof(be32 cookieverf[2]) to sizeof(be32 *).

At this point, NFS_COOKIEVERF seems to be more of an obfuscation
than a helper, so the best thing would be to just get rid of it.

Also see: https://bugzilla.kernel.org/show_bug.cgi?id=46881

Reported-by: Andi Kleen <andi@firstfloor.org>
Reported-by: David Binderman <dcb314@hotmail.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
2012-09-04 14:52:42 -04:00
Miklos Szeredi c9e67d4837 fuse: fix retrieve length
In some cases fuse_retrieve() would return a short byte count if offset was
non-zero.  The data returned was correct, though.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Cc: stable@vger.kernel.org
2012-09-04 18:45:54 +02:00
Jan Kara 156bddd8e5 ext3: Fix fdatasync() for files with only i_size changes
Code tracking when transaction needs to be committed on fdatasync(2) forgets
to handle a situation when only inode's i_size is changed. Thus in such
situations fdatasync(2) doesn't force transaction with new i_size to disk
and that can result in wrong i_size after a crash.

Fix the issue by updating inode's i_datasync_tid whenever its size is
updated.

CC: <stable@vger.kernel.org> # >= 2.6.32
Reported-by: Kristian Nielsen <knielsen@knielsen-hq.org>
Signed-off-by: Jan Kara <jack@suse.cz>
2012-09-04 00:04:43 +02:00
Daniel Mack 381bf7cad9 fuse: mark variables uninitialized
gcc 4.6.3 complains about uninitialized variables in fs/fuse/control.c:

  CC      fs/fuse/control.o
fs/fuse/control.c: In function 'fuse_conn_congestion_threshold_write':
fs/fuse/control.c:165:29: warning: 'val' may be used uninitialized in this function [-Wuninitialized]
fs/fuse/control.c: In function 'fuse_conn_max_background_write':
fs/fuse/control.c:128:23: warning: 'val' may be used uninitialized in this function [-Wuninitialized]

fuse_conn_limit_write() will always return non-zero unless the &val
is modified, so the warning is misleading. Let the compiler know
about it by marking 'val' with 'uninitialized_var'.

Signed-off-by: Daniel Mack <zonque@gmail.com>
Cc: Brian Foster <bfoster@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2012-09-03 17:44:06 +02:00
Linus Torvalds 5b716ac728 Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6
Pull CIFS fixes from Steve French.

* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
  CIFS: Fix cifs_do_create error hadnling
  cifs: print error code if smb signature verification fails
  CIFS: Fix log messages in packet checking for SMB2
  CIFS: Protect i_nlink from being negative
2012-09-02 11:30:10 -07:00
Miklos Szeredi 8d39d801d6 cuse: kill connection on initialization error
Luca Risolia reported that a CUSE daemon will continue to run even if
initialization of the emulated device failes for some reason (e.g. the device
number is already registered by another driver).

This patch disconnects the fuse device on error, which will make the userspace
CUSE daemon exit, albeit without indication about what the problem was.

Reported-by: Luca Risolia <luca.risolia@studio.unibo.it>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2012-08-30 19:24:35 +02:00
Miklos Szeredi bbd9979797 cuse: fix fuse_conn_kill()
fuse_conn_kill() removed fc->entry, called fuse_ctl_remove_conn() and
fuse_bdi_destroy().  None of which is appropriate for cuse cleanup.

The fuse_ctl_remove_conn() decrements the nlink on the control filesystem, which
is totally bogus.  The others are harmless but unnecessary.

So move these out from fuse_conn_kill() to fuse_put_super() where they belong.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2012-08-30 19:24:34 +02:00
Carlos Maiolino 6fb8a90aa3 xfs: fix race while discarding buffers [V4]
While xfs_buftarg_shrink() is freeing buffers from the dispose list (filled with
buffers from lru list), there is a possibility to have xfs_buf_stale() racing
with it, and removing buffers from dispose list before xfs_buftarg_shrink() does
it.

This happens because xfs_buftarg_shrink() handle the dispose list without
locking and the test condition in xfs_buf_stale() checks for the buffer being in
*any* list:

if (!list_empty(&bp->b_lru))

If the buffer happens to be on dispose list, this causes the buffer counter of
lru list (btp->bt_lru_nr) to be decremented twice (once in xfs_buftarg_shrink()
and another in xfs_buf_stale()) causing a wrong account usage of the lru list.

This may cause xfs_buftarg_shrink() to return a wrong value to the memory
shrinker shrink_slab(), and such account error may also cause an underflowed
value to be returned; since the counter is lower than the current number of
items in the lru list, a decrement may happen when the counter is 0, causing
an underflow on the counter.

The fix uses a new flag field (and a new buffer flag) to serialize buffer
handling during the shrink process. The new flag field has been designed to use
btp->bt_lru_lock/unlock instead of xfs_buf_lock/unlock mechanism.

dchinner, sandeen, aquini and aris also deserve credits for this.

Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2012-08-29 15:01:11 -05:00
Linus Torvalds 318e151019 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs fixes from Chris Mason:
 "I've split out the big send/receive update from my last pull request
  and now have just the fixes in my for-linus branch.  The send/recv
  branch will wander over to linux-next shortly though.

  The largest patches in this pull are Josef's patches to fix DIO
  locking problems and his patch to fix a crash during balance.  They
  are both well tested.

  The rest are smaller fixes that we've had queued.  The last rc came
  out while I was hacking new and exciting ways to recover from a
  misplaced rm -rf on my dev box, so these missed rc3."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (25 commits)
  Btrfs: fix that repair code is spuriously executed for transid failures
  Btrfs: fix ordered extent leak when failing to start a transaction
  Btrfs: fix a dio write regression
  Btrfs: fix deadlock with freeze and sync V2
  Btrfs: revert checksum error statistic which can cause a BUG()
  Btrfs: remove superblock writing after fatal error
  Btrfs: allow delayed refs to be merged
  Btrfs: fix enospc problems when deleting a subvol
  Btrfs: fix wrong mtime and ctime when creating snapshots
  Btrfs: fix race in run_clustered_refs
  Btrfs: don't run __tree_mod_log_free_eb on leaves
  Btrfs: increase the size of the free space cache
  Btrfs: barrier before waitqueue_active
  Btrfs: fix deadlock in wait_for_more_refs
  btrfs: fix second lock in btrfs_delete_delayed_items()
  Btrfs: don't allocate a seperate csums array for direct reads
  Btrfs: do not strdup non existent strings
  Btrfs: do not use missing devices when showing devname
  Btrfs: fix that error value is changed by mistake
  Btrfs: lock extents as we map them in DIO
  ...
2012-08-29 11:36:22 -07:00
Stefan Behrens 256dd1bb37 Btrfs: fix that repair code is spuriously executed for transid failures
If verify_parent_transid() fails for all mirrors, the current code
calls repair_io_failure() anyway which means:
- that the disk block is rewritten without repairing anything and
- that a kernel log message is printed which misleadingly claims
  that a read error was corrected.

This is an example:
parent transid verify failed on 615015833600 wanted 110423 found 110424
parent transid verify failed on 615015833600 wanted 110423 found 110424
btrfs read error corrected: ino 1 off 615015833600 (dev /dev/...)

It is wrong to ignore the results from verify_parent_transid() and to
call repair_eb_io_failure() when the verification of the transids failed.
This commit fixes the issue.

Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2012-08-28 16:53:43 -04:00
Liu Bo d280e5be94 Btrfs: fix ordered extent leak when failing to start a transaction
We cannot just return error before freeing ordered extent and releasing reserved
space when we fail to start a transacion.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2012-08-28 16:53:42 -04:00
Liu Bo 24c03fa5cf Btrfs: fix a dio write regression
This bug is introduced by commit 3b8bde746f6f9bd36a9f05f5f3b6e334318176a9
(Btrfs: lock extents as we map them in DIO).

In dio write, we should unlock the section which we didn't do IO on in case that
we fall back to buffered write.  But we need to not only unlock the section
but also cleanup reserved space for the section.

This bug was found while running xfstests 133, with this 133 no longer complains.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2012-08-28 16:53:41 -04:00
Josef Bacik bd7de2c9a4 Btrfs: fix deadlock with freeze and sync V2
We can deadlock with freeze right now because we unconditionally start a
transaction in our ->sync_fs() call.  To fix this just check and see if we
have a running transaction to commit.  This saves us from the deadlock
because at this point we'll have the umount sem for the sb so we're safe
from freezes coming in after we've done our check.  With this patch the
freeze xfstests no longer deadlocks.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2012-08-28 16:53:40 -04:00
Stefan Behrens 5ee0844d64 Btrfs: revert checksum error statistic which can cause a BUG()
Commit 442a4f6308 added btrfs device
statistic counters for detected IO and checksum errors to Linux 3.5.
The statistic part that counts checksum errors in
end_bio_extent_readpage() can cause a BUG() in a subfunction:
"kernel BUG at fs/btrfs/volumes.c:3762!"
That part is reverted with the current patch.
However, the counting of checksum errors in the scrub context remains
active, and the counting of detected IO errors (read, write or flush
errors) in all contexts remains active.

Cc: stable <stable@vger.kernel.org> # 3.5
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2012-08-28 16:53:39 -04:00
Stefan Behrens 68ce9682a4 Btrfs: remove superblock writing after fatal error
With commit acce952b0, btrfs was changed to flag the filesystem with
BTRFS_SUPER_FLAG_ERROR and switch to read-only mode after a fatal
error happened like a write I/O errors of all mirrors.
In such situations, on unmount, the superblock is written in
btrfs_error_commit_super(). This is done with the intention to be able
to evaluate the error flag on the next mount. A warning is printed
in this case during the next mount and the log tree is ignored.

The issue is that it is possible that the superblock points to a root
that was not written (due to write I/O errors).
The result is that the filesystem cannot be mounted. btrfsck also does
not start and all the other btrfs-progs tools fail to start as well.
However, mount -o recovery is working well and does the right things
to recover the filesystem (i.e., don't use the log root, clear the
free space cache and use the next mountable root that is stored in the
root backup array).

This patch removes the writing of the superblock when
BTRFS_SUPER_FLAG_ERROR is set, and removes the handling of the error
flag in the mount function.

These lines can be used to reproduce the issue (using /dev/sdm):
SCRATCH_DEV=/dev/sdm
SCRATCH_MNT=/mnt
echo 0 25165824 linear $SCRATCH_DEV 0 | dmsetup create foo
ls -alLF /dev/mapper/foo
mkfs.btrfs /dev/mapper/foo
mount /dev/mapper/foo $SCRATCH_MNT
echo bar > $SCRATCH_MNT/foo
sync
echo 0 25165824 error | dmsetup reload foo
dmsetup resume foo
ls -alF $SCRATCH_MNT
touch $SCRATCH_MNT/1
ls -alF $SCRATCH_MNT
sleep 35
echo 0 25165824 linear $SCRATCH_DEV 0 | dmsetup reload foo
dmsetup resume foo
sleep 1
umount $SCRATCH_MNT
btrfsck /dev/mapper/foo
dmsetup remove foo

Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
2012-08-28 16:53:38 -04:00
Josef Bacik ae1e206b80 Btrfs: allow delayed refs to be merged
Daniel Blueman reported a bug with fio+balance on a ramdisk setup.
Basically what happens is the balance relocates a tree block which will drop
the implicit refs for all of its children and adds a full backref.  Once the
block is relocated we have to add the implicit refs back, so when we cow the
block again we add the implicit refs for its children back.  The problem
comes when the original drop ref doesn't get run before we add the implicit
refs back.  The delayed ref stuff will specifically prefer ADD operations
over DROP to keep us from freeing up an extent that will have references to
it, so we try to add the implicit ref before it is actually removed and we
panic.  This worked fine before because the add would have just canceled the
drop out and we would have been fine.  But the backref walking work needs to
be able to freeze the delayed ref stuff in time so we have this ever
increasing sequence number that gets attached to all new delayed ref updates
which makes us not merge refs and we run into this issue.

So to fix this we need to merge delayed refs.  So everytime we run a
clustered ref we need to try and merge all of its delayed refs.  The backref
walking stuff locks the delayed ref head before processing, so if we have it
locked we are safe to merge any refs inside of the sequence number.  If
there is no sequence number we can merge all refs.  Doing this not only
fixes our bug but keeps the delayed ref code from adding and removing
useless refs and batching together multiple refs into one search instead of
one search per delayed ref, which will really help our commit times.  I ran
this with Daniels test and 276 and I haven't seen any problems.  Thanks,

Reported-by: Daniel J Blueman <daniel@quora.org>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2012-08-28 16:53:38 -04:00
Josef Bacik 5a24e84c55 Btrfs: fix enospc problems when deleting a subvol
Subvol delete is a special kind of awful where we use the global reserve to
cover the ENOSPC requirements.  The problem is once we're done removing
everything we do a btrfs_update_inode(), which by default will try to do the
delayed update stuff which will use it's own reserve.  There will be no
space in this reserve and we'll return ENOSPC.  So instead use
btrfs_update_inode_fallback() which will just fallback to updating the inode
item in the case of enospc.  This is fine because the global reserve covers
the space requirements for this.  With this patch I can now delete a subvol
on a problem image Dave Sterba sent me.  Thanks,

Reported-by: David Sterba <dave@jikos.cz>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-08-28 16:53:37 -04:00
Miao Xie c0f62dedd0 Btrfs: fix wrong mtime and ctime when creating snapshots
When we created a new snapshot, the mtime and ctime of its parent directory
were not updated. Fix it.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-08-28 16:53:36 -04:00
Arne Jansen 22cd2e7de7 Btrfs: fix race in run_clustered_refs
With commit

commit d1270cd91f
Author: Arne Jansen <sensille@gmx.net>
Date:   Tue Sep 13 15:16:43 2011 +0200

     Btrfs: put back delayed refs that are too new

I added a window where the delayed_ref's head->ref_mod code can diverge
from the sum of the remaining refs, because we release the head->mutex
in the middle. This leads to btrfs_lookup_extent_info returning wrong
numbers. This patch fixes this by adjusting the head's ref_mod with each
delayed ref we run.

Signed-off-by: Arne Jansen <sensille@gmx.net>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-08-28 16:53:35 -04:00
Chris Mason b12a3b1ea2 Btrfs: don't run __tree_mod_log_free_eb on leaves
When we split a leaf, we may end up inserting a new root on top of that
leaf.  The reflog code was incorrectly assuming the old root was always
a node.  This makes sure we skip over leaves.

Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-08-28 16:53:34 -04:00
Josef Bacik 6fc823b10f Btrfs: increase the size of the free space cache
Arne was complaining about the space cache having mismatching generation
numbers when debugging a deadlock.  This is because we can run out of space
in our preallocated range for our space cache if you have a pretty
fragmented amount of space in your pinned space.  So just increase the
amount of space we preallocate for space cache so we can be sure to have
enough space.  This will only really affect data ranges since their the only
chunks that end up larger than 256MB.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-08-28 16:53:34 -04:00
Josef Bacik 66657b318e Btrfs: barrier before waitqueue_active
We need a barrir before calling waitqueue_active otherwise we will miss
wakeups.  So in places that do atomic_dec(); then atomic_read() use
atomic_dec_return() which imply a memory barrier (see memory-barriers.txt)
and then add an explicit memory barrier everywhere else that need them.
Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2012-08-28 16:53:33 -04:00
Arne Jansen 1fa11e265f Btrfs: fix deadlock in wait_for_more_refs
Commit a168650c introduced a waiting mechanism to prevent busy waiting in
btrfs_run_delayed_refs. This can deadlock with btrfs_run_ordered_operations,
where a tree_mod_seq is held while waiting for the io to complete, while
the end_io calls btrfs_run_delayed_refs.
This whole mechanism is unnecessary. If not enough runnable refs are
available to satisfy count, just return as count is more like a guideline
than a strict requirement.
In case we have to run all refs, commit transaction makes sure that no
other threads are working in the transaction anymore, so we just assert
here that no refs are blocked.

Signed-off-by: Arne Jansen <sensille@gmx.net>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-08-28 16:53:32 -04:00
Fengguang Wu 6209526531 btrfs: fix second lock in btrfs_delete_delayed_items()
Fix a real bug caught by coccinelle.

fs/btrfs/delayed-inode.c:1013:1-11: second lock on line 1013

Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
2012-08-28 16:53:31 -04:00
Josef Bacik c329861da4 Btrfs: don't allocate a seperate csums array for direct reads
We've been allocating a big array for csums instead of storing them in the
io_tree like we do for buffered reads because previously we were locking the
entire range, so we didn't have an extent state for each sector of the
range.  But now that we do the range locking as we map the buffers we can
limit the mapping lenght to sectorsize and use the private part of the
io_tree for our csums.  This allows us to avoid an extra memory allocation
for direct reads which could incur latency.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2012-08-28 16:53:30 -04:00
Josef Bacik 99f5944b84 Btrfs: do not strdup non existent strings
When we close devices we add back empty devices for some reason that escapes
me.  In the case of a missing dev we don't allocate an rcu_string for it's
name, so check to see if the device has a name and if it doesn't don't
bother strdup()'ing it.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2012-08-28 16:53:29 -04:00
Josef Bacik aa9ddcd4b5 Btrfs: do not use missing devices when showing devname
If you do the following

mkfs.btrfs /dev/sdb /dev/sdc
rmmod btrfs
dd if=/dev/zero of=/dev/sdb bs=1M count=1
mount -o degraded /dev/sdc /mnt/btrfs-test

the box will panic trying to deref the name for the missing dev since it is
the lower numbered devid.  So fix show_devname to not use missing devices.
Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2012-08-28 16:53:29 -04:00
Stefan Behrens 3627bf4503 Btrfs: fix that error value is changed by mistake
In iterate_inodes_from_logical() the error result from
extent_from_logical() is patched by mistake. Typically ENOENT is
patched to EINVAL because (-ENOENT & BTRFS_EXTENT_FLAG_TREE_BLOCK)
evaluates to true.

Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
2012-08-28 16:53:28 -04:00
Josef Bacik eb838e73dc Btrfs: lock extents as we map them in DIO
A deadlock in xfstests 113 was uncovered by commit

d187663ef2

This is because we would not return EIOCBQUEUED for short AIO reads, instead
we'd wait for the DIO to complete and then return the amount of data we
transferred, which would allow our stuff to unlock the remaning amount.  But
with this change this no longer happens, so if we have a short AIO read (for
example if we try to read past EOF), we could leave the section from EOF to
the end of where we tried to read locked.  Fixing this is tricky since there
is no clear way to know exactly how much data DIO truly submitted for IO, so
to make this less hard on ourselves and less combersome we need to lock the
extents as we try to map them, and then we unlock any areas we didn't
actually map.  This makes us completely safe from deadlocks and reliance on
a particular behavior of the DIO code.  This also lays the groundwork for
allowing us to use the normal csum storage method for reads which means we
can remove an allocation.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2012-08-28 16:53:27 -04:00
Dan Carpenter dadd1105ca Btrfs: fix some endian bugs handling the root times
"trans->transid" is cpu endian but we want to store the data as little
endian.  "item->ctime.nsec" is only 32 bits, not 64.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
2012-08-28 16:53:26 -04:00
Dan Carpenter 55e591ffde Btrfs: unlock on error in btrfs_delalloc_reserve_metadata()
We should release this mutex before returning the error code.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
2012-08-28 16:53:25 -04:00
Dan Carpenter 57a5a88203 Btrfs: checking for NULL instead of IS_ERR
add_qgroup_rb() never returns NULL, only error pointers.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
2012-08-28 16:53:25 -04:00
Dan Carpenter 5986802c2f Btrfs: fix some error codes in btrfs_qgroup_inherit()
These are returning zero when it should be returning a negative error
code.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
2012-08-28 16:53:24 -04:00
Stefan Behrens aa2ffd0616 Btrfs: fix a misplaced address operator in a condition
This should obviously not be "if (&flag)" but "if (flag)".

Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
2012-08-28 16:53:23 -04:00
Linus Torvalds 89a897fbd8 Pull request from git://github.com/prasad-joshi/logfs_upstream.git
There are few important bug fixes for LogFS
 
 9f0bbd8 logfs: query block device for number of pages to send with bio
 	This BUG was found when LogFS was used on KVM. The patch fixes
 	the problem by asking for underlaying block device the number
 	of pages to send with each BIO.
 
 41b93bc logfs: maintain the ordering of meta-inode destruction
 	LogFS maintains file system meta-data in special inodes. These
 	inodes are releated to each other, therefore they must be
 	destroyed in a proper order.
 
 ddb24bb logfs: create a pagecache page if it is not present
 
 cd8bfa9 logfs: initialize the number of iovecs in bio
 	LogFS used to panic when it was created on an encrypted LVM
 	volume. The patch fixes the problem by properly initializing
 	the BIO.
 
 d2dcd90 logfs: destroy the reserved inodes while unmounting
 
 Diffstat:
  fs/logfs/dev_bdev.c  |   15 ++++++++-------
  fs/logfs/inode.c     |   18 +-----------------
  fs/logfs/journal.c   |    2 +-
  fs/logfs/readwrite.c |    1 +
  fs/logfs/segment.c   |    2 +-
  5 files changed, 12 insertions(+), 26 deletions(-)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJQNmnnAAoJEDFA/f+3K+ZNhc0QAJWWtcvqCIU2UHKw0DaQieH8
 X+YwoqdH5UBR0vylyoHCgCRSoRuttYEy47DYDhkpUdqad60atDhwIwhFaknlNZbt
 +hv2ve6hZTzcs42HZhu+0WUzMkUs0s6Xjuu9JlZND59zk1wWBfttDlKI2/SdjMlm
 GnW0nx5gLFZ1rmpaL8bdRpyLXxUvb9FmUc+YWsuinj8A2Lnqg5bNOmJZ3CiMYNVk
 UvbHDYJmNaMZndbYeXZtxXtUo9Uk4HsU+7whpmgD+OPz7h+VMOgfXnwkpsgmtbY2
 qTXAjyFVpN23zhBTbpCMXvpfbrffdBJBfkEW+2sXqpr9IHbMX0Y9cb/XJ3Ub1Qz+
 HRLdBh0iAZewWVRsGOKG2OU1WUrTNzKUAJ795QFL0c0bZsODzQ+5OlcD7rBzEMpq
 ioN49UtJWlmckWaott6PinSl/OKWlvz771Zayh9+ttuL3Dvt6coV7K3ns5MI6glN
 M9DTMd8GAW2Kdz80EyAjZYWw2M3lbs0/GghJB4ozYg3mrDpBq1w5ouEvSNw8PuJw
 yoUEe2xLGpLZdzQDouKMlrai8kohCyMoHKH1Kimx+iG4LZkYLy8VJepY1mfzQbEY
 1zSB1nCV/c67qI3fxRdPNmJ7YtjTC9ero0qOouXwSJWwZ+OEikGXOKWnjRutcY1C
 Jfbfg6gNqOrZZbYWp1ke
 =eS/c
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://github.com/prasad-joshi/logfs_upstream

Pull LogFS bugfixes from Prasad Joshi:

 - "logfs: query block device for number of pages to send with bio"

	This BUG was found when LogFS was used on KVM. The patch fixes
	the problem by asking for underlaying block device the number
	of pages to send with each BIO.

 - "logfs: maintain the ordering of meta-inode destruction"

	LogFS maintains file system meta-data in special inodes. These
	inodes are releated to each other, therefore they must be
	destroyed in a proper order.

 - "logfs: initialize the number of iovecs in bio"

	LogFS used to panic when it was created on an encrypted LVM
	volume. The patch fixes the problem by properly initializing
	the BIO.

Plus a couple more:
 - logfs: create a pagecache page if it is not present
 - logfs: destroy the reserved inodes while unmounting

* tag 'for-linus' of git://github.com/prasad-joshi/logfs_upstream:
  logfs: query block device for number of pages to send with bio
  logfs: maintain the ordering of meta-inode destruction
  logfs: create a pagecache page if it is not present
  logfs: initialize the number of iovecs in bio
  logfs: destroy the reserved inodes while unmounting
2012-08-26 10:14:11 -07:00
Linus Torvalds e1d33a5c63 xfs: bugfixes for 3.6-rc4
- fix uninitialised variable in xfs_rtbuf_get()
 - unlock the AGI buffer when looping in xfs_dialloc
 - check for possible overflow in xfs_ioc_trim
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.10 (GNU/Linux)
 
 iQIcBAABAgAGBQJQN7FMAAoJENaLyazVq6ZOrwsQAMB2Cg/t/XeiPFD5QtvFX7Ua
 9cS3klSo9p29Xr5oeo54b3mF+5dycS61PQu2zdg4XhU5ZqoWoS6xUIFxzo3yWiVi
 nC/BbgMQTgkfEQs+2ulmXRgP8A2DdakSDmvAZDQNURZQwH4zykU0tPKD9V009TMW
 SaIprIWQqhRrsd9prrv364FgKv3/vyuFL1Agho304o3ynk+2SLDzm3eqdTaXTD9Z
 9OGP5Hs5Mh9CZmr46MIpRnoaHppqNRZh7EbDyMFL8kMN8F2yOdJk5MgA1m7gPJ5Z
 ADfnv3laxEp04hhD5lazePYJwZlg/KB77Y4YXD5hmTSznIxcQNlfRcMV9ju4FGgr
 Nb5SgA/6iBk5dkZyuyIdqCzj1Mt6SimpIWb3+/vcS8+VaqQUsyoufccO0zrgeoIA
 /O1PtCe//haFENlD+hMc9QrQSISsaR/YYUv0yz4YcsxKsdmyYpbHdTyFhVjnkKoS
 BtA0ZOgWYebgEzr7J1ijW085CrDvAc9jsPujOWmZlBJWcfo1rtVfqhXWll0ms5cr
 ZRoWqLDqoOhjlUDqwMN4dk6ENxG4QbSl6AWXXVXrUvtvaNzoi0MBNX3HPt/RVmF5
 rGKFvLuIxBHAfD1xYCqtCfP9yzeQzDy1SPdASForEZOyM+RVUnOf9i+1nhbsu/97
 AhiW4lozQetWzx+RQwU+
 =+FuN
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-v3.6-rc4' of git://oss.sgi.com/xfs/xfs

Pull xfs bugfixes from Ben Myers:
 - fix uninitialised variable in xfs_rtbuf_get()
 - unlock the AGI buffer when looping in xfs_dialloc
 - check for possible overflow in xfs_ioc_trim

* tag 'for-linus-v3.6-rc4' of git://oss.sgi.com/xfs/xfs:
  xfs: check for possible overflow in xfs_ioc_trim
  xfs: unlock the AGI buffer when looping in xfs_dialloc
  xfs: fix uninitialised variable in xfs_rtbuf_get()
2012-08-25 11:47:06 -07:00
Linus Torvalds 8497ae61d0 Merge branch 'for-3.6' of git://linux-nfs.org/~bfields/linux
Pull nfsd bugfixes from J. Bruce Fields:
 "Particular thanks to Michael Tokarev, Malahal Naineni, and Jamie
  Heilman for their testing and debugging help."

* 'for-3.6' of git://linux-nfs.org/~bfields/linux:
  svcrpc: fix svc_xprt_enqueue/svc_recv busy-looping
  svcrpc: sends on closed socket should stop immediately
  svcrpc: fix BUG() in svc_tcp_clear_pages
  nfsd4: fix security flavor of NFSv4.0 callback
2012-08-25 11:43:41 -07:00
Linus Torvalds a7e546f175 Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block-related fixes from Jens Axboe:

 - Improvements to the buffered and direct write IO plugging from
   Fengguang.

 - Abstract out the mapping of a bio in a request, and use that to
   provide a blk_bio_map_sg() helper.  Useful for mapping just a bio
   instead of a full request.

 - Regression fix from Hugh, fixing up a patch that went into the
   previous release cycle (and marked stable, too) attempting to prevent
   a loop in __getblk_slow().

 - Updates to discard requests, fixing up the sizing and how we align
   them.  Also a change to disallow merging of discard requests, since
   that doesn't really work properly yet.

 - A few drbd fixes.

 - Documentation updates.

* 'for-linus' of git://git.kernel.dk/linux-block:
  block: replace __getblk_slow misfix by grow_dev_page fix
  drbd: Write all pages of the bitmap after an online resize
  drbd: Finish requests that completed while IO was frozen
  drbd: fix drbd wire compatibility for empty flushes
  Documentation: update tunable options in block/cfq-iosched.txt
  Documentation: update tunable options in block/cfq-iosched.txt
  Documentation: update missing index files in block/00-INDEX
  block: move down direct IO plugging
  block: remove plugging at buffered write time
  block: disable discard request merge temporarily
  bio: Fix potential memory leak in bio_find_or_create_slab()
  block: Don't use static to define "void *p" in show_partition_start()
  block: Add blk_bio_map_sg() helper
  block: Introduce __blk_segment_map_sg() helper
  fs/block-dev.c:fix performance regression in O_DIRECT writes to md block devices
  block: split discard into aligned requests
  block: reorganize rounding of max_discard_sectors
2012-08-25 11:36:43 -07:00
Linus Torvalds 62688e5b64 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull UDF, ext3 & reiserfs fixes from Jan Kara:
 "A couple of fixes (udf, reiserfs, ext3) that accumulated over my
  vacation."

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  udf: fix retun value on error path in udf_load_logicalvol
  jbd: don't write superblock when unmounting an ro filesystem
  reiserfs: fix deadlocks with quotas
  quota: Move down dqptr_sem read after initializing default warn[] type at __dquot_alloc_space().
  UDF: During mount free lvid_bh before rescanning with different blocksize
  udf: fix udf_setsize() for file data in ICB
2012-08-23 21:56:22 -07:00
Linus Torvalds f4673d6f18 UBIFS fixes for 3.6:
1. Fix crash on error which prevents emulated power-cut testing.
 2. Fix log reply regression introduced in 3.6-rc1.
 3. Fix UBIFS complaints about too small debug buffer size which.
 4. Fix error message spelling, and remove incorrect commentary.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJQNdHYAAoJECmIfjd9wqK0t+cQANEkF1Xgs8cd/5U++tQWCGyu
 EInLm+416ytTy4Vg2bLwlJ4OnUhgtTSTeKovtp3yoSKytxB4vf50qI5P/gInXSxo
 mI1KkvHZLraFzFPGtnbXGCg0YS8xZLIhw2p8VDuNFcxZIEiuAGZ2qMC9tPzzzle6
 em/yDIumR//oioFHOqVNu6wKE3do9QY5fo0meqE6QTliKppSDhWcO9MCfodkFOCp
 WtlD2dfIQv3SdaZW+X8skbqMFQ9KJRAzk59hl1Cm2jlXEHEHYmvGwqTCSOk8W/0c
 3ZSgUVW1YKBYq25yA3Xdz6aDCgRT6LYsi/EH0VNNMlt/H9RfcCNTGe96ioTcgsOT
 FroLZZUK9PiQYmqUqnk8eqmUyq888vKZY3Sbxt5/bbZhSoK6aYzNgfLlkZNfnHDL
 unzpyPqYq9VahuTgY5ravDclPKCc2p7xeWJYeOE9JtUY6/fVKJh9qERrtVoNP+nD
 LjS89G/tNBn9ZPk8KyXgikJ4JZ7sw30vySmBdGtx+ckzXqKTsDtLznvdSU3/XiJM
 zdqYV3gulgoJtIPXFVuI3i3GeKyXuJlPpjNimuFn+9mNElZfrvs6RcGbQ3Df/3Mc
 omfTrRxS+vRadGC9MZoIMCV27p7+cEynbLLVXtvPebwqiO1tV1fCspdJGjJmkbvL
 14hi7hTYvYCtMR+m8i2+
 =VvYb
 -----END PGP SIGNATURE-----

Merge tag 'upstream-3.6-rc3' of git://git.infradead.org/linux-ubifs

Pull UBIFS fixes from Artem Bityutskiy:
 - Fix crash on error which prevents emulated power-cut testing.
 - Fix log reply regression introduced in 3.6-rc1.
 - Fix UBIFS complaints about too small debug buffer size which.
 - Fix error message spelling, and remove incorrect commentary.

* tag 'upstream-3.6-rc3' of git://git.infradead.org/linux-ubifs:
  UBIFS: fix error messages spelling
  UBIFS: fix complaints about too small debug buffer size
  UBIFS: fix replay regression
  UBIFS: fix crash on error path
  UBIFS: remove stale commentary
2012-08-23 21:50:40 -07:00
Tomas Racek a672e1be30 xfs: check for possible overflow in xfs_ioc_trim
If range.start or range.minlen is bigger than filesystem size, return
invalid value error. This fixes possible overflow in BTOBB macro when
passed value was nearly ULLONG_MAX.

Signed-off-by: Tomas Racek <tracek@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2012-08-23 14:48:44 -05:00
Christoph Hellwig 7612903099 xfs: unlock the AGI buffer when looping in xfs_dialloc
Also update some commens in the area to make the code easier to read.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2012-08-23 14:48:32 -05:00
Dave Chinner 0b9e3f6d84 xfs: fix uninitialised variable in xfs_rtbuf_get()
Results in this assert failure in generic/090:

XFS: Assertion failed: *nmap >= 1, file: fs/xfs/xfs_bmap.c, line: 4363
.....
Call Trace:
 [<ffffffff814680db>] xfs_bmapi_read+0x6b/0x370
 [<ffffffff814b64b2>] xfs_rtbuf_get+0x42/0x130
 [<ffffffff814b6f09>] xfs_rtget_summary+0x89/0x120
 [<ffffffff814b7bfe>] xfs_rtallocate_extent_size+0xce/0x340
 [<ffffffff814b89f0>] xfs_rtallocate_extent+0x240/0x290
 [<ffffffff81462c1a>] xfs_bmap_rtalloc+0x1ba/0x340
 [<ffffffff81463a65>] xfs_bmap_alloc+0x35/0x40
 [<ffffffff8146f111>] xfs_bmapi_allocate+0xf1/0x350
 [<ffffffff8146f9de>] xfs_bmapi_write+0x66e/0xa60
 [<ffffffff8144538a>] xfs_iomap_write_direct+0x22a/0x3f0
 [<ffffffff8143707b>] __xfs_get_blocks+0x38b/0x5d0
 [<ffffffff814372d4>] xfs_get_blocks_direct+0x14/0x20
 [<ffffffff811b0081>] do_blockdev_direct_IO+0xf71/0x1eb0
 [<ffffffff811b1015>] __blockdev_direct_IO+0x55/0x60
 [<ffffffff814355ca>] xfs_vm_direct_IO+0x11a/0x1e0
 [<ffffffff8112d617>] generic_file_direct_write+0xd7/0x1b0
 [<ffffffff8143e16c>] xfs_file_dio_aio_write+0x13c/0x320
 [<ffffffff8143e6f2>] xfs_file_aio_write+0x1c2/0x1d0
 [<ffffffff81174a07>] do_sync_write+0xa7/0xe0
 [<ffffffff81175288>] vfs_write+0xa8/0x160
 [<ffffffff81175702>] sys_pwrite64+0x92/0xb0
 [<ffffffff81b68f69>] system_call_fastpath+0x16/0x1b

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2012-08-23 14:48:16 -05:00
Hugh Dickins 676ce6d5ca block: replace __getblk_slow misfix by grow_dev_page fix
Commit 91f68c89d8 ("block: fix infinite loop in __getblk_slow")
is not good: a successful call to grow_buffers() cannot guarantee
that the page won't be reclaimed before the immediate next call to
__find_get_block(), which is why there was always a loop there.

Yesterday I got "EXT4-fs error (device loop0): __ext4_get_inode_loc:3595:
inode #19278: block 664: comm cc1: unable to read itable block" on console,
which pointed to this commit.

I've been trying to bisect for weeks, why kbuild-on-ext4-on-loop-on-tmpfs
sometimes fails from a missing header file, under memory pressure on
ppc G5.  I've never seen this on x86, and I've never seen it on 3.5-rc7
itself, despite that commit being in there: bisection pointed to an
irrelevant pinctrl merge, but hard to tell when failure takes between
18 minutes and 38 hours (but so far it's happened quicker on 3.6-rc2).

(I've since found such __ext4_get_inode_loc errors in /var/log/messages
from previous weeks: why the message never appeared on console until
yesterday morning is a mystery for another day.)

Revert 91f68c89d8, restoring __getblk_slow() to how it was (plus
a checkpatch nitfix).  Simplify the interface between grow_buffers()
and grow_dev_page(), and avoid the infinite loop beyond end of device
by instead checking init_page_buffers()'s end_block there (I presume
that's more efficient than a repeated call to blkdev_max_block()),
returning -ENXIO to __getblk_slow() in that case.

And remove akpm's ten-year-old "__getblk() cannot fail ... weird"
comment, but that is worrying: are all users of __getblk() really
now prepared for a NULL bh beyond end of device, or will some oops??

Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: stable@vger.kernel.org # 3.0 3.2 3.4 3.5
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-08-23 12:17:36 +02:00
Linus Torvalds f753c4ec15 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
Pull ceph fixes from Sage Weil:
 "Jim's fix closes a narrow race introduced with the msgr changes.  One
  fix resolves problems with debugfs initialization that Yan found when
  multiple client instances are created (e.g., two clusters mounted, or
  rbd + cephfs), another one fixes problems with mounting a nonexistent
  server subdirectory, and the last one fixes a divide by zero error
  from unsanitized ioctl input that Dan Carpenter found."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  ceph: avoid divide by zero in __validate_layout()
  libceph: avoid truncation due to racing banners
  ceph: tolerate (and warn on) extraneous dentry from mds
  libceph: delay debugfs initialization until we learn global_id
2012-08-22 09:58:05 -07:00
Linus Torvalds ad746be969 NFS client bugfixes for Linux 3.6
- NFSv3 mounts need to fail if the FSINFO rpc call fails
 - Ensure that the NFS commit cache gets torn down when we unload the
   NFS module.
 - Fix memory scribble issues when interrupting a LAYOUTGET rpc call
 - Fix NFSv4 legacy idmapper regressions
 - Fix issues with the NFSv4 getacl command
 - Fix a regression when using the legacy "mount -t nfs4"
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJQNPyKAAoJEGcL54qWCgDyQlAP/AmnLaJmKMPj5P4GCiA/uxMV
 sr0XSSVAv+OgUtRxLNO5PDyunVIIEjwA5nGa2RZEG14+eYwNEeQehaGoaGqoKLhh
 ATOCMvHHEs5qSKLckWsZWflf6U/MLimYS3D/hHVUtP0QWbBMtol3ImABs2ht/VUc
 HW0wQoEG+5mhbhC87d4ku5E+OHQzYeNNmdX2VkKOmT0CwRd/bfsHxlGwMmcHldP2
 TSge+MRhoKNycpy0j7nugHj2pZ12UiX+O8371OlAketnPk0OA+6RNGMyNo49IQAn
 VEN9MFWc4ypH1+hJ0OvhzmUA6Zn2/lyQXUtwM1DcXeBmLI4aClb3bT3G3xKNh09O
 yLEvkh2aUUlmoTSgP7vDK/Ui3QwVnsaJULvg+gywQJG6qTSFgUQfErNKgJmabHQb
 d0EuumlSGnJ7qdC5nmrUp+M/CpbfGh15ax/JnTRGVK7C3jeCm18vc2np+z4EUDp/
 USzrvuBu1B8eI0nQNoht0BeNf7sEJGgFIn8BJzxDakP8buXPMqEvFwA8c1JmMeBX
 E6pbG2jHqPdrTJtxQTpMRqhA0MxbFlVNi5cCs6U7ifiBqbRWEXyNVgAHTwoIxIrn
 ASbL0s8gZH+EMEXNOmPmFiO2NUjBCeSzAjrsTmSpWa13YmGfiroZN8N/iPzLnFf+
 FR3SFUnoJHq7x4uLZoAX
 =lqfh
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-3.6-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfixes from Trond Myklebust:
 - NFSv3 mounts need to fail if the FSINFO rpc call fails
 - Ensure that the NFS commit cache gets torn down when we unload the
   NFS module.
 - Fix memory scribble issues when interrupting a LAYOUTGET rpc call
 - Fix NFSv4 legacy idmapper regressions
 - Fix issues with the NFSv4 getacl command
 - Fix a regression when using the legacy "mount -t nfs4"

* tag 'nfs-for-3.6-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  NFSv3: Ensure that do_proc_get_root() reports errors correctly
  NFSv4: Ensure that nfs4_alloc_client cleans up on error.
  NFS: return -ENOKEY when the upcall fails to map the name
  NFS: Clear key construction data if the idmap upcall fails
  NFSv4: Don't use private xdr_stream fields in decode_getacl
  NFSv4: Fix the acl cache size calculation
  NFSv4: Fix pointer arithmetic in decode_getacl
  NFS: Alias the nfs module to nfs4
  NFS: Fix a regression when loading the NFS v4 module
  NFSv4.1: Remove a bogus BUG_ON() in nfs4_layoutreturn_done
  pnfs-obj: Better IO pattern in case of unaligned offset
  NFS41: add pg_layout_private to nfs_pageio_descriptor
  pnfs: nfs4_proc_layoutget returns void
  pnfs: defer release of pages in layoutget
  nfs: tear down caches in nfs_init_writepagecache when allocation fails
2012-08-22 09:57:25 -07:00
Linus Torvalds 467e9e51d0 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull assorted fixes - mostly vfs - from Al Viro:
 "Assorted fixes, with an unexpected detour into vfio refcounting logics
  (fell out when digging in an analog of eventpoll race in there)."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  task_work: add a scheduling point in task_work_run()
  fs: fix fs/namei.c kernel-doc warnings
  eventpoll: use-after-possible-free in epoll_create1()
  vfio: grab vfio_device reference *before* exposing the sucker via fd_install()
  vfio: get rid of vfio_device_put()/vfio_group_get_device* races
  vfio: get rid of open-coding kref_put_mutex
  introduce kref_put_mutex()
  vfio: don't dereference after kfree...
  mqueue: lift mnt_want_write() outside ->i_mutex, clean up a bit
2012-08-22 09:56:06 -07:00
Artem Bityutskiy 69f9025894 UBIFS: fix error messages spelling
Corruptio -> corruption.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2012-08-22 17:41:09 +03:00
Randy Dunlap 55852635a8 fs: fix fs/namei.c kernel-doc warnings
Fix kernel-doc warnings in fs/namei.c:

Warning(fs/namei.c:360): No description found for parameter 'inode'
Warning(fs/namei.c:672): No description found for parameter 'nd'

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc:	Alexander Viro <viro@zeniv.linux.org.uk>
Cc:	linux-fsdevel@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-08-22 10:30:10 -04:00
Al Viro 98022748f6 eventpoll: use-after-possible-free in epoll_create1()
As soon as we'd installed the file into descriptor table, it can
get closed by another thread.  Freeing ep in process...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-08-22 10:26:55 -04:00
Artem Bityutskiy 65b455b123 UBIFS: fix complaints about too small debug buffer size
When debugging is enabled, we use a temporary on-stack buffer for formatting
the key strings like "(11368871, direntry, 0xcd0750)". The buffer size is
32 bytes and sometimes it is not enough to fit the key string - e.g., when
inode numbers are high. This is not fatal, but the key strings are incomplete
and UBIFS complains like this:

	UBIFS assert failed in dbg_snprintf_key at 137 (pid 1)

This is a regression caused by "515315a UBIFS: fix key printing".

Fix the issue by increasing the buffer to 48 bytes.

Reported-by: Michael Hench <michaelhench@gmail.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Tested-by: Michael Hench <michaelhench@gmail.com>
Cc: stable@vger.kernel.org [v3.3+]
2012-08-22 11:51:33 +03:00