Commit graph

18018 commits

Author SHA1 Message Date
Richard Kennedy
2e147f1ef7 fs: inode.c use atomic_inc_return in __iget
Using atomic_inc_return in __iget(struct inode *inode) makes the intent
of this code clearer and generates less code on processors that have
this operation.

On x86_64 this patch reduces the text size of inode.o by 12 bytes.

Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>

----
patch against 2.6.34-rc7
compiled & tested on x86_64 AMD X2

I've been running with this patch applied for several weeks with no
obvious problems.
regards
Richard
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:21 -04:00
Eric Paris
a7cf4145bb anon_inode: set S_IFREG on the anon_inode
anon_inode_mkinode() sets inode->i_mode = S_IRUSR | S_IWUSR;  This means
that (inode->i_mode & S_IFMT) == 0.  This trips up some SELinux code that
needs to determine if a given inode is a regular file, a directory, etc.
The easiest solution is to just make sure that the anon_inode also sets
S_IFREG.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:20 -04:00
Stephen Hemminger
b7bb0a1291 gfs: constify xattr_handler
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:20 -04:00
Stephen Hemminger
537d81ca7c ocfs: constify xattr_handler
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:20 -04:00
Stephen Hemminger
365f0cb9d2 jffs2: constify xattr_handler
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:20 -04:00
Stephen Hemminger
46e58764f0 xfs: constify xattr_handler
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:19 -04:00
Stephen Hemminger
94d09a98cd reiserfs: constify xattr_handler
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:19 -04:00
Stephen Hemminger
11e2752807 ext4: constify xattr_handler
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:19 -04:00
Stephen Hemminger
d1f21049f9 ext3: constify xattr handlers
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:19 -04:00
Stephen Hemminger
749c72efa4 ext2: constify xattr_handler
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:18 -04:00
Stephen Hemminger
f01cbd3f81 btrfs: constify xattr_handler
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:18 -04:00
Stephen Hemminger
bb4354538e fs: xattr_handler table should be const
The entries in xattr handler table should be immutable (ie const)
like other operation tables.

Later patches convert common filesystems. Uncoverted filesystems
will still work, but will generate a compiler warning.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:18 -04:00
Josef Bacik
18e9e5104f Introduce freeze_super and thaw_super for the fsfreeze ioctl
Currently the way we do freezing is by passing sb>s_bdev to freeze_bdev and then
letting it do all the work.  But freezing is more of an fs thing, and doesn't
really have much to do with the bdev at all, all the work gets done with the
super.  In btrfs we do not populate s_bdev, since we can have multiple bdev's
for one fs and setting s_bdev makes removing devices from a pool kind of tricky.
This means that freezing a btrfs filesystem fails, which causes us to corrupt
with things like tux-on-ice which use the fsfreeze mechanism.  So instead of
populating sb->s_bdev with a random bdev in our pool, I've broken the actual fs
freezing stuff into freeze_super and thaw_super.  These just take the
super_block that we're freezing and does the appropriate work.  It's basically
just copy and pasted from freeze_bdev.  I've then converted freeze_bdev over to
use the new super helpers.  I've tested this with ext4 and btrfs and verified
everything continues to work the same as before.

The only new gotcha is multiple calls to the fsfreeze ioctl will return EBUSY if
the fs is already frozen.  I thought this was a better solution than adding a
freeze counter to the super_block, but if everybody hates this idea I'm open to
suggestions.  Thanks,

Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:18 -04:00
Al Viro
e1e46bf186 Trim includes in fs/super.c
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:17 -04:00
Al Viro
d3f2147307 Move grabbing s_umount to callers of grab_super()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:17 -04:00
Al Viro
7ed1ee6118 Take statfs variants to fs/statfs.c
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:17 -04:00
Al Viro
01a05b337a new helper: iterate_supers()
... and switch the simple "loop over superblocks and do something"
loops to it.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:16 -04:00
Al Viro
35cf7ba0b4 Bury __put_super_and_need_restart()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:16 -04:00
Al Viro
79893c17b4 fix prune_dcache()/umount() race
... and get rid of the last __put_super_and_need_restart() caller

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:16 -04:00
Al Viro
df40c01a92 In get_super() and user_get_super() restarts are unconditional
If superblock had been still alive, we would've returned it...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:16 -04:00
Al Viro
1494583de5 fix get_active_super()/umount() race
This one needs restarts...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:15 -04:00
Al Viro
e7fe0585ca fix do_emergency_remount()/umount() races
need list_for_each_entry_safe() here.  Original didn't even
have restart logics, so if you race with umount() it blew up.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:15 -04:00
Al Viro
6754af6464 Convert simple loops over superblocks to list_for_each_entry_safe
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:15 -04:00
Al Viro
8edd64bd60 get rid of restarts in sync_filesystems()
At the same time we can kill s_need_restart and local mutex in there.
__put_super() made public for a while; will be gone later.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:15 -04:00
Al Viro
551de6f34d Leave superblocks on s_list until the end
We used to remove from s_list and s_instances at the same
time.  So let's *not* do the former and skip superblocks
that have empty s_instances in the loops over s_list.

The next step, of course, will be to get rid of rescan logics
in those loops.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:14 -04:00
Al Viro
1712ac8fda Saner locking around deactivate_super()
Make sure that s_umount is acquired *before* we drop the final
active reference; we still have the fast path (atomic_dec_unless)
and we have gotten rid of the window between the moment when
s_active hits zero and s_umount is acquired.  Which simplifies
the living hell out of grab_super() and inotify pin_to_kill()
stuff.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:14 -04:00
Al Viro
b20bd1a5e7 get rid of S_BIAS
use atomic_inc_not_zero(&sb->s_active) instead of playing games with
checking ->s_count > S_BIAS

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:14 -04:00
Al Viro
389b8be6ef get rid of open-coded grab_super() in get_active_super()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:14 -04:00
Al Viro
3981f2e2a0 ceph: should use deactivate_locked_super() on failure exits
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:13 -04:00
Al Viro
2ccde7c631 Clean ecryptfs ->get_sb() up
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:13 -04:00
Al Viro
decabd6650 fix a couple of ecryptfs leaks
First of all, get_sb_nodev() grabs anon dev minor and we
never free it in ecryptfs ->kill_sb().  Moreover, on one
of the failure exits in ecryptfs_get_sb() we leak things -
it happens before we set ->s_root and ->put_super() won't
be called in that case.  Solution: kill ->put_super(), do
all that stuff in ->kill_sb().  And use kill_anon_sb() instead
of generic_shutdown_super() to deal with anon dev leak.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:13 -04:00
Al Viro
894680710d Simplify devpts_get_sb() failure exits
postpone simple_set_mnt() until we know we won't fail.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:12 -04:00
Christoph Hellwig
a135aa2cd7 remove incorrect comment in do_emergency_remount
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:12 -04:00
Al Viro
13e3c5e5b9 clean DCACHE_CANT_MOUNT in d_delete()
We set the "it's dead, don't mount on it" flag _and_ do not remove it if
we turn the damn thing negative and leave it around.  And if it goes
positive afterwards, well...

Fortunately, there's only one place where that needs to be caught:
only d_delete() can turn the sucker negative without immediately freeing
it; all other places that can lead to ->d_iput() call are followed by
unconditionally freeing struct dentry in question.  So the fix is obvious:

Addresses https://bugzilla.kernel.org/show_bug.cgi?id=16014
Reported-by: Adam Tkac <vonsch@gmail.com>
Tested-by: Adam Tkac <vonsch@gmail.com>
Cc: <stable@kernel.org>         [2.6.34.x]

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:12 -04:00
Linus Torvalds
d515e86e63 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-udf-2.6
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-udf-2.6:
  udf: BKL ioctl pushdown
2010-05-21 11:17:43 -07:00
Linus Torvalds
7ce1418f95 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6: (31 commits)
  dquot: Detect partial write error to quota file in write_blk() and add printk_ratelimit for quota error messages
  ocfs2: Fix lock inversion in quotas during umount
  ocfs2: Use __dquot_transfer to avoid lock inversion
  ocfs2: Fix NULL pointer deref when writing local dquot
  ocfs2: Fix estimate of credits needed for quota allocation
  ocfs2: Fix quota locking
  ocfs2: Avoid unnecessary block mapping when refreshing quota info
  ocfs2: Do not map blocks from local quota file on each write
  quota: Refactor dquot_transfer code so that OCFS2 can pass in its references
  quota: unify quota init condition in setattr
  quota: remove sb_has_quota_active in get/set_info
  quota: unify ->set_dqblk
  quota: unify ->get_dqblk
  ext3: make barrier options consistent with ext4
  quota: Make quota stat accounting lockless.
  suppress warning: "quotatypes" defined but not used
  ext3: Fix waiting on transaction during fsync
  jbd: Provide function to check whether transaction will issue data barrier
  ufs: add ufs speciffic ->setattr call
  BKL: Remove BKL from ext2 filesystem
  ...
2010-05-21 10:50:28 -07:00
Jiaying Zhang
1907131bbe dquot: Detect partial write error to quota file in write_blk() and add printk_ratelimit for quota error messages
This patch changes quota_tree.c:write_blk() to detect error caused by partial
write to quota file and add a macro to limit control printed quota error
messages so we won't fill up dmesg with a corrupted quota file.

Signed-off-by: Jiaying Zhang <jiayingz@google.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2010-05-21 19:30:49 +02:00
Jan Kara
c06bcbfa1e ocfs2: Fix lock inversion in quotas during umount
We cannot cancel delayed work from ocfs2_local_free_info because that is called
with dqonoff_mutex held and the work it cancels requires dqonoff_mutex to
finish. Cancel the work before acquiring dqonoff_mutex.

Acked-by: Joel Becker <Joel.Becker@oracle.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2010-05-21 19:30:48 +02:00
Jan Kara
52a9ee281c ocfs2: Use __dquot_transfer to avoid lock inversion
dquot_transfer() acquires own references to dquots via dqget(). Thus it waits
for dq_lock which creates a lock inversion because dq_lock ranks above
transaction start but transaction is already started in ocfs2_setattr(). Fix
the problem by passing own references directly to __dquot_transfer.

Acked-by: Joel Becker <Joel.Becker@oracle.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2010-05-21 19:30:48 +02:00
Jan Kara
741e128933 ocfs2: Fix NULL pointer deref when writing local dquot
commit_dqblk() can write quota info to global file. That is actually a bad
thing to do because if we are just modifying local quota file, we are not
prepared (do not hold proper locks, do not have transaction credits) to do
a modification of the global quota file. So do not use commit_dqblk() and
instead call our writing function directly.

Acked-by: Joel Becker <Joel.Becker@oracle.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2010-05-21 19:30:48 +02:00
Jan Kara
832d09cf14 ocfs2: Fix estimate of credits needed for quota allocation
We were missing reservation of a journal credit for modification of quota
file inode when creating new dquot structure in the global quota file.

Acked-by: Joel Becker <Joel.Becker@oracle.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2010-05-21 19:30:47 +02:00
Jan Kara
fb8dd8d780 ocfs2: Fix quota locking
OCFS2 had three issues with quota locking:
a) When reading dquot from global quota file, we started a transaction while
   holding dqio_mutex which is prone to deadlocks because other paths do it
   the other way around
b) During ocfs2_sync_dquot we were not protected against concurrent writers
   on the same node. Because we first copy data to local buffer, a race
   could happen resulting in old data being written to global quota file and
   thus causing quota inconsistency after a crash.
c) ip_alloc_sem of quota files was acquired while a transaction is started
   in ocfs2_quota_write which can deadlock because we first get ip_alloc_sem
   and then start a transaction when extending quota files.

We fix the problem a) by pulling all necessary code to ocfs2_acquire_dquot
and ocfs2_release_dquot. Thus we no longer depend on generic dquot_acquire
to do the locking and can force proper lock ordering.

Problems b) and c) are fixed by locking i_mutex and ip_alloc_sem of
global quota file in ocfs2_lock_global_qf and removing ip_alloc_sem from
ocfs2_quota_read and ocfs2_quota_write.

Acked-by: Joel Becker <Joel.Becker@oracle.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2010-05-21 19:30:47 +02:00
Jan Kara
ae4f6ef134 ocfs2: Avoid unnecessary block mapping when refreshing quota info
The position of global quota file info does not change. So we do not have
to do logical -> physical block translation every time we reread it from
disk. Thus we can also avoid taking ip_alloc_sem.

Acked-by: Joel Becker <Joel.Becker@oracle.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2010-05-21 19:30:46 +02:00
Jan Kara
f64dd44eb7 ocfs2: Do not map blocks from local quota file on each write
There is no need to map offset of local dquot structure to on disk block
in each quota write. It is enough to map it just once and store the physical
block number in quota structure in memory. Moreover this simplifies locking
as we do not have to take ip_alloc_sem from quota write path.

Acked-by: Joel Becker <Joel.Becker@oracle.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2010-05-21 19:30:46 +02:00
Jan Kara
bc8e5f0739 quota: Refactor dquot_transfer code so that OCFS2 can pass in its references
Currently, __dquot_transfer() acquires its own references of dquot structures
that will be put into inode. But for OCFS2, this creates a lock inversion
between dq_lock (waited on in dqget) and transaction start (started in
ocfs2_setattr). Currently, deadlock is impossible because dq_lock is acquired
only during dquot_acquire and dquot_release and we already hold a reference to
dquot structures in ocfs2_setattr so neither of these functions can be called
while we call dquot_transfer. But this is rather subtle and it is hard to teach
lockdep about it. So provide __dquot_transfer function that can be passed dquot
references directly. OCFS2 can then pass acquired dquot references directly to
__dquot_transfer with proper locking.

Signed-off-by: Jan Kara <jack@suse.cz>
2010-05-21 19:30:45 +02:00
Dmitry Monakhov
12755627bd quota: unify quota init condition in setattr
Quota must being initialized if size or uid/git changes requested.
But initialization performed in two different places:
in case of i_size file system is responsible for dquot init
, but in case of uid/gid init will be called internally in
dquot_transfer().
This ambiguity makes code harder to understand.
Let's move this logic to one common helper function.

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Jan Kara <jack@suse.cz>
2010-05-21 19:30:45 +02:00
Christoph Hellwig
fcbc59f96e quota: remove sb_has_quota_active in get/set_info
The methods already do these checks, so remove them in the quotactl
implementation to allow non-VFS quota implementations to also support
these calls.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
2010-05-21 19:30:45 +02:00
Christoph Hellwig
c472b43275 quota: unify ->set_dqblk
Pass the larger struct fs_disk_quota to the ->set_dqblk operation so
that the Q_SETQUOTA and Q_XSETQUOTA operations can be implemented
with a single filesystem operation and we can retire the ->set_xquota
operation.  The additional information (RT-subvolume accounting and
warn counts) are left zero for the VFS quota implementation.

Add new fieldmask values for setting the numer of blocks and inodes
values which is required for the VFS quota, but wasn't for XFS.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
2010-05-21 19:30:44 +02:00
Christoph Hellwig
b9b2dd36c1 quota: unify ->get_dqblk
Pass the larger struct fs_disk_quota to the ->get_dqblk operation so
that the Q_GETQUOTA and Q_XGETQUOTA operations can be implemented
with a single filesystem operation and we can retire the ->get_xquota
operation.  The additional information (RT-subvolume accounting and
warn counts) are left zero for the VFS quota implementation.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
2010-05-21 19:30:43 +02:00
Eric Sandeen
0636c73ee7 ext3: make barrier options consistent with ext4
ext4 was updated to accept barrier/nobarrier mount options
in addition to the older barrier=0/1.  The barrier story
is complex enough, we should help people by making the options
the same at least, even if the defaults are different.

This patch allows the barrier/nobarrier mount options for ext3,
while keeping nobarrier the default.

It also unconditionally displays barrier status in show_options,
and prints a message at mount time if barriers are not enabled,
just as ext4 does.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2010-05-21 19:30:41 +02:00