Commit Graph

17908 Commits (78ef7fab0eb0a5b159842bac89aed74bb0aa7bfe)

Author SHA1 Message Date
Alex Elder 36adecff50 xfs: nothing special about 1-block log sector
There are a number of places where a log sector size of 1 uses
special case code.  The round_up() and round_down() macros
produce the correct result even when the log sector size is 1, and
this eliminates the need for treating this as a special case.

Signed-off-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2010-05-19 09:58:12 -05:00
Alex Elder ff30a6221d xfs: encapsulate bbcount validity checking
Define a function that encapsulates checking the validity of a log
block count.

(Updated from previous version--no longer includes error reporting in the
encapsulated validation function.)

Signed-off-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
2010-05-19 09:58:12 -05:00
Alex Elder 5c17f5339f xfs: kill XLOG_SECTOR_ROUND*()
XLOG_SECTOR_ROUNDUP_BBCOUNT() and XLOG_SECTOR_ROUNDDOWN_BLKNO()
are now fairly simple macro translations.  Just get rid of them in
favor of the round_up() and round_down() macro calls they represent.

Also, in spots in xlog_get_bp() and xlog_write_log_records(),
round_up() was being called with value 1, which just evaluates
to the macro's second argument; so just use that instead.
In the latter case, make use of that value, as long as it's
already been computed.

Signed-off-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
2010-05-19 09:58:12 -05:00
Alex Elder 8511998baa xfs: simplify XLOG_SECTOR_ROUND*()
XLOG_SECTOR_ROUNDUP_BBCOUNT() is defined in "fs/xfs/xfs_log_recover.c"
in an overly-complicated way.  It is basically roundup(), but that
is not at all clear from its definition.  (Actually, there is
another macro round_up() that applies for power-of-two-based masks
which I'll be using here.)

The operands in XLOG_SECTOR_ROUNDUP_BBCOUNT() are basically the
block number (bbs) and the log sector basic block mask
(log->l_sectbb_mask).  I'll call them B and M for this discussion.

The macro computes is value this way:
	M && (B & M) ? (B + M + 1) & ~M : B

Put another way, we can break it into 3 cases:
	1)  ! M          -> B			# 0 mask, no effect
	2)  ! (B & M)    -> B			# sector aligned
	3)  M && (B & M) -> (B + M + 1) & ~M	# round up otherwise

The round_up() macro is cleverly defined using a value, v, and a
power-of-2, p, and the result is the nearest multiple of p greater
than or equal to v.  Its value is computed something like this:
	((v - 1) | (p - 1)) + 1
Let's consider using this in the context of the 3 cases above.

When p = 2^0 = 1, the result boils down to ((v - 1) | 0) + 1, so it
just translates any value v to itself.  That handles case (1) above.

When p = 2^n, n > 0, we know that (p - 1) will be a mask with all n
bits 0..n-1 set.  The condition in this case occurs when none of
those mask bits is set in the value v provided.  If that is the
case, subtracting 1 from v will have 1's in all those lower bits (at
least).  Therefore, OR-ing the mask with that decremented value has
no effect, so adding the 1 back again will just translate the v to
itself.  This handles case (2).

Otherwise, the value v is greater than some multiple of p, and
decrementing it will produce a result greater than or equal to that
multiple.  OR-ing in the mask will produce a value 1 less than the
next multiple of p, so finally adding 1 back will result in the
desired rounded-up value.  This handles case (3).

Hopefully this is convincing.

While I was at it, I converted XLOG_SECTOR_ROUNDDOWN_BLKNO() to use
the round_down() macro.

Signed-off-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
2010-05-19 09:58:12 -05:00
Alex Elder 6881a229f6 xfs: fix min bufsize bugs in two places
This fixes a bug in two places that I found by inspection.  In
xlog_find_verify_cycle() and xlog_write_log_records(), the code
attempts to allocate a buffer to hold as many blocks as possible.
It gives up if the number of blocks to be allocated gets too small.
Right now it uses log->l_sectbb_log as that lower bound, but I'm
sure it's supposed to be the actual log sector size instead.  That
is, the lower bound should be (1 << log->l_sectbb_log).

Also define a simple macro xlog_sectbb(log) to represent the number
of basic blocks in a sector for the given log.

(No change from original submission; I have implemented Christoph's
suggestion about storing l_sectsize rather than l_sectbb_log in
a new, separate patch in this series.)

Signed-off-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
2010-05-19 09:58:12 -05:00
Alex Elder a0e856b0b4 xfs: add const qualifiers to xfs error function args
Change the tag and file name arguments to xfs_error_report() and
xfs_corruption_error() to use a const qualifier.

Signed-off-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
2010-05-19 09:58:11 -05:00
Christoph Hellwig 74457cf4a3 xfs: remove xfs_dqmarker
The xfs_dqmarker structure does not need to exist anymore. Move the
remaining flags field out of it and remove the structure altogether.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
2010-05-19 09:58:11 -05:00
Dave Chinner 3a8406f6d6 xfs: convert the dquot free list to use list heads
Convert the dquot free list on the filesystem to use listhead
infrastructure rather than the roll-your-own in the quota code.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2010-05-19 09:58:11 -05:00
Dave Chinner e6a81f13aa xfs: convert the dquot hash list to use list heads
Convert the dquot hash list on the filesystem to use listhead
infrastructure rather than the roll-your-own in the quota code.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2010-05-19 09:58:11 -05:00
Dave Chinner 368e136174 xfs: remove duplicate code from dquot reclaim
The dquot shaker and the free-list reclaim code use exactly the same
algorithm but the code is duplicated and slightly different in each
case. Make the shaker code use the single dquot reclaim code to
remove the code duplication.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2010-05-19 09:58:11 -05:00
Dave Chinner 3a25404b3f xfs: convert the per-mount dquot list to use list heads
Convert the dquot list on the filesytesm to use listhead
infrastructure rather than the roll-your-own in the quota code.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2010-05-19 09:58:10 -05:00
Dave Chinner 9abbc539bf xfs: add log item recovery tracing
Currently there is no tracing in log recovery, so it is difficult to
determine what is going on when something goes wrong.

Add tracing for log item recovery to provide visibility into the log
recovery process. The tracing added shows regions being extracted
from the log transactions and added to the transaction hash forming
recovery items, followed by the reordering, cancelling and finally
recovery of the items.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2010-05-19 09:58:10 -05:00
Christoph Hellwig e6b1f27370 xfs: clean up xlog_write_adv_cnt
Replace the awkward xlog_write_adv_cnt with an inline helper that makes
it more obvious that it's modifying it's paramters, and replace the use
of an integer type for "ptr" with a real void pointer.  Also move
xlog_write_adv_cnt to xfs_log_priv.h as it will be used outside of
xfs_log.c in the delayed logging series.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2010-05-19 09:58:10 -05:00
Dave Chinner 55b66332d0 xfs: introduce new internal log vector structure
The current log IO vector structure is a flat array and not
extensible. To make it possible to keep separate log IO vectors for
individual log items, we need a method of chaining log IO vectors
together.

Introduce a new log vector type that can be used to wrap the
existing log IO vectors on use that internally to the log. This
means that the existing external interface (xfs_log_write) does not
change and hence no changes to the transaction commit code are
required.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2010-05-19 09:58:10 -05:00
Christoph Hellwig 99428ad0f6 xfs: reindent xlog_write
Reindent xlog_write to normal one tab indents and move all variable
declarations into the closest enclosing block.

Split from a bigger patch by Dave Chinner.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2010-05-19 09:58:10 -05:00
Dave Chinner b5203cd0a4 xfs: factor xlog_write
xlog_write is a mess that takes a lot of effort to understand. It is
a mass of nested loops with 4 space indents to get it to fit in 80 columns
and lots of funky variables that aren't obvious what they mean or do.

Break it down into understandable chunks.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2010-05-19 09:58:09 -05:00
Dave Chinner 9b9fc2b760 xfs: log ticket reservation underestimates the number of iclogs
When allocation a ticket for a transaction, the ticket is initialised with the
worst case log space usage based on the number of bytes the transaction may
consume. Part of this calculation is the number of log headers required for the
iclog space used up by the transaction.

This calculation makes an undocumented assumption that if the transaction uses
the log header space reservation on an iclog, then it consumes either the
entire iclog or it completes. That is - the transaction that is first in an
iclog is the transaction that the log header reservation is accounted to. If
the transaction is larger than the iclog, then it will use the entire iclog
itself. Document this assumption.

Further, the current calculation uses the rule that we can fit iclog_size bytes
of transaction data into an iclog. This is in correct - the amount of space
available in an iclog for transaction data is the size of the iclog minus the
space used for log record headers. This means that the calculation is out by
512 bytes per 32k of log space the transaction can consume. This is rarely an
issue because maximally sized transactions are extremely uncommon, and for 4k
block size filesystems maximal transaction reservations are about 400kb. Hence
the error in this case is less than the size of an iclog, so that makes it even
harder to hit.

However, anyone using larger directory blocks (16k directory blocks push the
maximum transaction size to approx. 900k on a 4k block size filesystem) or
larger block size (e.g. 64k blocks push transactions to the 3-4MB size) could
see the error grow to more than an iclog and at this point the transaction is
guaranteed to get a reservation underrun and shutdown the filesystem.

Fix this by adjusting the calculation to calculate the correct number of iclogs
required and account for them all up front.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2010-05-19 09:58:09 -05:00
Dave Chinner b1c1b5b610 xfs: Clean up xfs_trans_committed code after factoring
Now that the code has been factored, clean up all the remaining
style cruft, simplify the code and re-order functions so that it
doesn't need forward declarations.

Also move the remaining functions that require forward declarations
(xfs_trans_uncommit, xfs_trans_free) so that all the forward
declarations can be removed from the file.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2010-05-19 09:58:09 -05:00
Dave Chinner 8e646a55ac xfs: update and factor xfs_trans_committed()
The function header to xfs-trans_committed has long had this
comment:

 * THIS SHOULD BE REWRITTEN TO USE xfs_trans_next_item()

To prepare for different methods of committing items, convert the
code to use xfs_trans_next_item() and factor the code into smaller,
more digestible chunks.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2010-05-19 09:58:09 -05:00
Christoph Hellwig a3ccd2ca43 xfs: clean up xfs_trans_commit logic even more
> +shut_us_down:
> +	shutdown = XFS_FORCED_SHUTDOWN(mp) ? EIO : 0;
> +	if (!(tp->t_flags & XFS_TRANS_DIRTY) || shutdown) {
> +		xfs_trans_unreserve_and_mod_sb(tp);
> +		/*

This whole area in _xfs_trans_commit is still a complete mess.

So while touching this code, unravel this mess as well to make the
whole flow of the function simpler and clearer.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
2010-05-19 09:58:08 -05:00
Dave Chinner 0924378a68 xfs: split out iclog writing from xfs_trans_commit()
Split the the part of xfs_trans_commit() that deals with writing the
transaction into the iclog into a separate function. This isolates the
physical commit process from the logical commit operation and makes
it easier to insert different transaction commit paths without affecting
the existing algorithm adversely.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2010-05-19 09:58:08 -05:00
Dave Chinner 713bf88bba xfs: fix reservation release commit flag in xfs_bmap_add_attrfork()
xfs_bmap_add_attrfork() passes XFS_TRANS_PERM_LOG_RES to xfs_trans_commit()
to indicate that the commit should release the permanent log reservation
as part of the commit. This is wrong - the correct flag is
XFS_TRANS_RELEASE_LOG_RES - and it is only by the chance that both these
flags have the value of 0x4 that the code is doing the right thing.

Fix it by changing to use the correct flag.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2010-05-19 09:58:08 -05:00
Dave Chinner 8e12385086 xfs: remove stale parameter from ->iop_unpin method
The staleness of a object being unpinned can be directly derived
from the object itself - there is no need to extract it from the
object then pass it as a parameter into IOP_UNPIN().

This means we can kill the XFS_LID_BUF_STALE flag - it is set,
checked and cleared in the same places XFS_BLI_STALE flag in the
xfs_buf_log_item so it is now redundant and hence safe to remove.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2010-05-19 09:58:08 -05:00
Dave Chinner 4aaf15d1aa xfs: Add inode pin counts to traces
We don't record pin counts in inode events right now, and this makes
it difficult to track down problems related to pinning inodes. Add
the pin count to the inode trace class and add trace events for
pinning and unpinning inodes.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2010-05-19 09:58:08 -05:00
Dave Chinner 43f5efc5b5 xfs: factor log item initialisation
Each log item type does manual initialisation of the log item.
Delayed logging introduces new fields that need initialisation, so
factor all the open coded initialisation into a common function
first.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2010-05-19 09:58:07 -05:00
Jan Engelhardt e2a07812e9 xfs: add blockdev name to kthreads
This allows to see in `ps` and similar tools which kthreads are
allotted to which block device/filesystem, similar to what jbd2
does. As the process name is a fixed 16-char array, no extra
space is needed in tasks.

  PID TTY      STAT   TIME COMMAND
    2 ?        S      0:00 [kthreadd]
  197 ?        S      0:00  \_ [jbd2/sda2-8]
  198 ?        S      0:00  \_ [ext4-dio-unwrit]
  204 ?        S      0:00  \_ [flush-8:0]
 2647 ?        S      0:00  \_ [xfs_mru_cache]
 2648 ?        S      0:00  \_ [xfslogd/0]
 2649 ?        S      0:00  \_ [xfsdatad/0]
 2650 ?        S      0:00  \_ [xfsconvertd/0]
 2651 ?        S      0:00  \_ [xfsbufd/ram0]
 2652 ?        S      0:00  \_ [xfsaild/ram0]
 2653 ?        S      0:00  \_ [xfssyncd/ram0]

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
2010-05-19 09:58:07 -05:00
Zhitong Wang fda168c245 xfs: Fix integer overflow in fs/xfs/linux-2.6/xfs_ioctl*.c
The am_hreq.opcount field in the xfs_attrmulti_by_handle() interface
is not bounded correctly. The opcount is used to determine the size
of the buffer required. The size is bounded, but can overflow and so
the size checks may not be sufficient to catch invalid opcounts.
Fix it by catching opcount values that would cause overflows before
calculating the size.

Signed-off-by: Zhitong Wang <zhitong.wangzt@alibaba-inc.com>
Reviewed-by: Dave Chinner <david@fromorbit.com>
2010-05-19 09:58:07 -05:00
David S. Miller 2ec8c6bb5d Merge branch 'master' of /home/davem/src/GIT/linux-2.6/
Conflicts:
	include/linux/mod_devicetable.h
	scripts/mod/file2alias.c
2010-05-18 23:01:55 -07:00
Joel Becker 18d3a98f3c ocfs2: Silence a gcc warning.
ocfs2_block_group_claim_bits() is never called with min_bits=0, but we
shouldn't leave status undefined if it ever is.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
2010-05-18 16:48:41 -07:00
Tao Ma 5f5261acb0 ocfs2: Don't retry xattr set in case value extension fails.
In normal xattr set, the set sequence is inode, xattr block
and finally xattr bucket if we meet with a ENOSPC. But there
is a corner case.
So consider we will set a xattr whose value will be stored in
a cluster, and there is no xattr block by now. So we will
reserve 1 xattr block and 1 cluster for setting it. Now if we
fail in value extension(in case the volume is almost full and
we can't allocate the cluster because the check in
ocfs2_test_bg_bit_allocatable), ENOSPC will be returned. So
we will try to create a bucket(this time there is a chance that
the reserved cluster will be used), and when we try value extension
again, kernel bug happens. We did meet with it. Check the bug below.
http://oss.oracle.com/bugzilla/show_bug.cgi?id=1251

This patch just try to avoid this by adding a set_abort in
ocfs2_xattr_set_ctxt, so in case ENOSPC happens in value extension,
we will check whether it is caused by the real ENOSPC or just the
full of inode or xattr block. If it is the first case, we set set_abort
so that we don't try any further. we are safe to exit directly here
ince it is really ENOSPC.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
2010-05-18 16:41:39 -07:00
Wengang Wang d9ef75221a ocfs2:dlm: avoid dlm->ast_lock lockres->spinlock dependency break
Currently we process a dirty lockres with the lockres->spinlock taken. While
during the process, we may need to lock on dlm->ast_lock. This breaks the
dependency of dlm->ast_lock(lock first) and lockres->spinlock(lock second).

This patch fixes the problem.
Since we can't release lockres->spinlock, we have to take dlm->ast_lock
just before taking the lockres->spinlock and release it after lockres->spinlock
is released. And use __dlm_queue_bast()/__dlm_queue_ast(), the nolock version,
in dlm_shuffle_lists(). There are no too many locks on a lockres, so there is no
performance harm.

Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
2010-05-18 16:41:34 -07:00
Tao Ma d5a7df0649 ocfs2: Reset xattr value size after xa_cleanup_value_truncate().
In ocfs2_prepare_xattr_entry, if we fail to grow an existing value,
xa_cleanup_value_truncate() will leave the old entry in place.  Thus, we
reset its value size.  However, if we were allocating a new value, we
must not reset the value size or we will BUG().  This resolves
oss.oracle.com bug 1247.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
2010-05-18 16:41:21 -07:00
Joel Becker 41841b0bce Merge branch 'discontig-bg' of git://oss.oracle.com/git/tma/linux-2.6 into ocfs2-merge-window 2010-05-18 16:40:42 -07:00
J. Bruce Fields e4e83ea47b Revert "nfsd4: distinguish expired from stale stateids"
This reverts commit 78155ed75f.

We're depending here on the boot time that we use to generate the
stateid being monotonic, but get_seconds() is not necessarily.

We still depend at least on boot_time being different every time, but
that is a safer bet.

We have a few reports of errors that might be explained by this problem,
though we haven't been able to confirm any of them.

But the minor gain of distinguishing expired from stale errors seems not
worth the risk.

Conflicts:

	fs/nfsd/nfs4state.c

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-05-18 19:03:50 -04:00
Julia Lawall 316ce2ba8e fs/ocfs2/dlm: Use kstrdup
Use kstrdup when the goal of an allocation is copy a string into the
allocated region.

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@
expression from,to;
expression flag,E1,E2;
statement S;
@@

-  to = kmalloc(strlen(from) + 1,flag);
+  to = kstrdup(from, flag);
   ... when != \(from = E1 \| to = E1 \)
   if (to==NULL || ...) S
   ... when != \(from = E2 \| to = E2 \)
-  strcpy(to, from);
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
2010-05-18 12:31:11 -07:00
Julia Lawall 3914ed0cec fs/ocfs2/dlm: Drop memory allocation cast
Drop cast on the result of kmalloc and similar functions.

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@
type T;
@@

- (T *)
  (\(kmalloc\|kzalloc\|kcalloc\|kmem_cache_alloc\|kmem_cache_zalloc\|
   kmem_cache_alloc_node\|kmalloc_node\|kzalloc_node\)(...))
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
2010-05-18 12:31:10 -07:00
Tristan Ye c1631d4a48 Ocfs2: Optimize punching-hole code.
This patch simplifies the logic of handling existing holes and
skipping extent blocks and removes some confusing comments.

The patch survived the fill_verify_holes testcase in ocfs2-test.
It also passed my manual sanity check and stress tests with enormous
extent records.

Currently punching a hole on a file with 3+ extent tree depth was
really a performance disaster.  It can even take several hours,
though we may not hit this in real life with such a huge extent
number.

One simple way to improve the performance is quite straightforward.
From the logic of truncate, we can punch the hole from hole_end to
hole_start, which reduces the overhead of btree operations in a
significant way, such as tree rotation and moving.

Following is the testing result when punching hole from 0 to file end
in bytes, on a 1G file, 1G file consists of 256k extent records, each record
cover 4k data(just one cluster, clustersize is 4k):

===========================================================================
 * Original punching-hole mechanism:
===========================================================================

   I waited 1 hour for its completion, unfortunately it's still ongoing.

===========================================================================
 * Patched punching-hode mechanism:
===========================================================================

   real 0m2.518s
   user 0m0.000s
   sys  0m2.445s

That means we've gained up to 1000 times improvement on performance in this
case, whee! It's fairly cool. and it looks like that performance gain will
be raising when extent records grow.

The patch was based on my former 2 patches, which were about truncating
codes optimization and fixup to handle CoW on punching hole.

Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
Acked-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
2010-05-18 12:31:05 -07:00
Tristan Ye ee149a7c6c Ocfs2: Make ocfs2_find_cpos_for_left_leaf() public.
The original idea to pull ocfs2_find_cpos_for_left_leaf() out of
alloc.c is to benefit punching-holes optimization patch, it however,
can also be referred by other funcs in the future who want to do the
same job.

Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
Acked-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
2010-05-18 12:28:13 -07:00
Tristan Ye e8aec068ec Ocfs2: Fix hole punching to correctly do CoW during cluster zeroing.
Based on the previous patch of optimizing truncate, the bugfix for
refcount trees when punching holes can be fairly easy
and straightforward since most of work we should take into account for
refcounting have been completed already in ocfs2_remove_btree_range().

This patch performs CoW for refcounted extents when a hole being punched
whose start or end offset were in the middle of a cluster, which means
partial zeroing of the cluster will be performed soon.

The patch has been tested fixing the following bug:

http://oss.oracle.com/bugzilla/show_bug.cgi?id=1216

Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
Acked-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
2010-05-18 12:27:46 -07:00
Tristan Ye 78f94673d7 Ocfs2: Optimize ocfs2 truncate to use ocfs2_remove_btree_range() instead.
Truncate is just a special case of punching holes(from new i_size to
end), we therefore could take advantage of the existing
ocfs2_remove_btree_range() to reduce the comlexity and redundancy in
alloc.c.  The goal here is to make truncate more generic and
straightforward.

Several functions only used by ocfs2_commit_truncate() will smiply be
removed.

ocfs2_remove_btree_range() was originally used by the hole punching
code, which didn't take refcount trees into account (definitely a bug).
We therefore need to change that func a bit to handle refcount trees.
It must take the refcount lock, calculate and reserve blocks for
refcount tree changes, and decrease refcounts at the end.  We replace 
ocfs2_lock_allocators() here by adding a new func
ocfs2_reserve_blocks_for_rec_trunc() which accepts some extra blocks to
reserve.  This will not hurt any other code using
ocfs2_remove_btree_range() (such as dir truncate and hole punching).

I merged the following steps into one patch since they may be
logically doing one thing, though I know it looks a little bit fat
to review.

1). Remove redundant code used by ocfs2_commit_truncate(), since we're
    moving to ocfs2_remove_btree_range anyway.

2). Add a new func ocfs2_reserve_blocks_for_rec_trunc() for purpose of
    accepting some extra blocks to reserve.

3). Change ocfs2_prepare_refcount_change_for_del() a bit to fit our
    needs.  It's safe to do this since it's only being called by
    truncate.

4). Change ocfs2_remove_btree_range() a bit to take refcount case into
    account.

5). Finally, we change ocfs2_commit_truncate() to call
    ocfs2_remove_btree_range() in a proper way.

The patch has been tested normally for sanity check, stress tests
with heavier workload will be expected.

Based on this patch, fixing the punching holes bug will be fairly easy.

Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
Acked-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
2010-05-18 12:25:10 -07:00
Pavel Emelyanov 47cee541a4 nfsd: safer initialization order in find_file()
The alloc_init_file() first adds a file to the hash and then
initializes its fi_inode, fi_id and fi_had_conflict.

The uninitialized fi_inode could thus be erroneously checked by
the find_file(), so move the hash insertion lower.

The client_mutex should prevent this race in practice; however, we
eventually hope to make less use of the client_mutex, so the ordering
here is an accident waiting to happen.

I didn't find whether the same can be true for two other fields,
but the common sense tells me it's better to initialize an object
before putting it into a global hash table :)

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-05-18 12:05:20 -04:00
J. Bruce Fields b7299f4439 nfs4: minor callback code simplification, comment
Note the position in the version array doesn't have to match the actual
rpc version number--to me it seems clearer to maintain the distinction.

Also document choice of rpc callback version number, as discussed in
e.g. http://www.ietf.org/mail-archive/web/nfsv4/current/msg07985.html
and followups.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-05-18 11:51:38 -04:00
Linus Torvalds b8ae30ee26 Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (49 commits)
  stop_machine: Move local variable closer to the usage site in cpu_stop_cpu_callback()
  sched, wait: Use wrapper functions
  sched: Remove a stale comment
  ondemand: Make the iowait-is-busy time a sysfs tunable
  ondemand: Solve a big performance issue by counting IOWAIT time as busy
  sched: Intoduce get_cpu_iowait_time_us()
  sched: Eliminate the ts->idle_lastupdate field
  sched: Fold updating of the last_update_time_info into update_ts_time_stats()
  sched: Update the idle statistics in get_cpu_idle_time_us()
  sched: Introduce a function to update the idle statistics
  sched: Add a comment to get_cpu_idle_time_us()
  cpu_stop: add dummy implementation for UP
  sched: Remove rq argument to the tracepoints
  rcu: need barrier() in UP synchronize_sched_expedited()
  sched: correctly place paranioa memory barriers in synchronize_sched_expedited()
  sched: kill paranoia check in synchronize_sched_expedited()
  sched: replace migration_thread with cpu_stop
  stop_machine: reimplement using cpu_stop
  cpu_stop: implement stop_cpu[s]()
  sched: Fix select_idle_sibling() logic in select_task_rq_fair()
  ...
2010-05-18 08:27:54 -07:00
Linus Torvalds fd25a1f556 Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6: (23 commits)
  cifs: fix noserverino handling when unix extensions are enabled
  cifs: don't update uniqueid in cifs_fattr_to_inode
  cifs: always revalidate hardlinked inodes when using noserverino
  [CIFS] drop quota operation stubs
  cifs: propagate cifs_new_fileinfo() error back to the caller
  cifs: add comments explaining cifs_new_fileinfo behavior
  cifs: remove unused parameter from cifs_posix_open_inode_helper()
  [CIFS] Remove unused cifs_oplock_cachep
  cifs: have decode_negTokenInit set flags in server struct
  cifs: break negotiate protocol calls out of cifs_setup_session
  cifs: eliminate "first_time" parm to CIFS_SessSetup
  [CIFS] Fix lease break for writes
  cifs: save the dialect chosen by server
  cifs: change && to ||
  cifs: rename "extended_security" to "global_secflags"
  cifs: move tcon find/create into separate function
  cifs: move SMB session creation code into separate function
  cifs: track local_nls in volume info
  [CIFS] Allow null nd (as nfs server uses) on create
  [CIFS] Fix losing locks during fork()
  ...
2010-05-18 07:18:38 -07:00
James Morris 539c99fd7f Merge branch 'next' into for-linus 2010-05-18 08:57:00 +10:00
Jeff Layton 4065c802da cifs: fix noserverino handling when unix extensions are enabled
The uniqueid field sent by the server when unix extensions are enabled
is currently used sometimes when it shouldn't be. The readdir codepath
is correct, but most others are not. Fix it.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2010-05-17 20:59:21 +00:00
Jeff Layton 84f30c66c3 cifs: don't update uniqueid in cifs_fattr_to_inode
We use this value to find an inode within the hash bucket, so we can't
change this without re-hashing the inode. For now, treat this value
as immutable.

Eventually, we should probably use an inode number change on a path
based operation to indicate that the lookup cache is invalid, but that's
a bit more code to deal with.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2010-05-17 20:57:27 +00:00
Jeff Layton db19272edc cifs: always revalidate hardlinked inodes when using noserverino
The old cifs_revalidate logic always revalidated hardlinked inodes.
This hack allowed CIFS to pass some connectathon tests when server inode
numbers aren't used (basic test7, in particular).

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2010-05-17 20:55:58 +00:00
Linus Torvalds 7d32c0aca4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/joern/logfs
* git://git.kernel.org/pub/scm/linux/kernel/git/joern/logfs:
  logfs: handle powerfail on NAND flash
  logfs: handle errors from get_mtd_device()
  logfs: remove unused variable
  logfs: fix sync
  logfs: fix compile failure
  logfs: initialize li->li_refcount
  logfs: commit reservations under space pressure
  logfs: survive logfs_buf_recover read errors
  logfs: Close i_ino reuse race
  logfs: fix logfs_seek_hole()
  logfs: Return -EINVAL if filesystem image doesn't match
  LogFS: Fix typo in b6349ac8
  logfs: testing the wrong variable
2010-05-17 13:53:35 -07:00
Frederic Weisbecker c2f980500a procfs: Kill the bkl in ioctl
There are no more users of procfs that implement the ioctl
callback. Drop the bkl from this path and warn on any use
of this callback.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: John Kacur <jkacur@redhat.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
2010-05-17 03:06:24 +02:00
Simon Arlott e0f43752a9 bridge: update sysfs link names if port device names have changed
Links for each port are created in sysfs using the device
name, but this could be changed after being added to the
bridge.

As well as being unable to remove interfaces after this
occurs (because userspace tools don't recognise the new
name, and the kernel won't recognise the old name), adding
another interface with the old name to the bridge will
cause an error trying to create the sysfs link.

This fixes the problem by listening for NETDEV_CHANGENAME
notifications and renaming the link.

https://bugzilla.kernel.org/show_bug.cgi?id=12743

Signed-off-by: Simon Arlott <simon@fire.lp0.eu>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-15 23:10:15 -07:00
Linus Torvalds 18e41da89d Merge git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable
* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
  Btrfs: check for read permission on src file in the clone ioctl
2010-05-15 12:55:31 -07:00
Dan Rosenberg 5dc6416414 Btrfs: check for read permission on src file in the clone ioctl
The existing code would have allowed you to clone a file that was
only open for writing

Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-05-15 12:05:50 -04:00
Linus Torvalds 3f8bf8f0fd Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
  JFS: Free sbi memory in error path
  fs/sysv: dereferencing ERR_PTR()
  Fix double-free in logfs
  Fix the regression created by "set S_DEAD on unlink()..." commit
2010-05-15 09:03:15 -07:00
Jan Blunck 684bdc7ff9 JFS: Free sbi memory in error path
I spotted the missing kfree() while removing the BKL.

[akpm@linux-foundation.org: avoid multiple returns so it doesn't happen again]
Signed-off-by: Jan Blunck <jblunck@suse.de>
Cc: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-15 07:16:34 -04:00
Dan Carpenter 404e781249 fs/sysv: dereferencing ERR_PTR()
I moved the dir_put_page() inside the if condition so we don't dereference
"page", if it's an ERR_PTR().

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-15 07:16:33 -04:00
Al Viro 265624495f Fix double-free in logfs
iput() is needed *until* we'd done successful d_alloc_root()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-15 07:16:33 -04:00
Al Viro d83c49f3e3 Fix the regression created by "set S_DEAD on unlink()..." commit
1) i_flags simply doesn't work for mount/unlink race prevention;
we may have many links to file and rm on one of those obviously
shouldn't prevent bind on top of another later on.  To fix it
right way we need to mark _dentry_ as unsuitable for mounting
upon; new flag (DCACHE_CANT_MOUNT) is protected by d_flags and
i_mutex on the inode in question.  Set it (with dont_mount(dentry))
in unlink/rmdir/etc., check (with cant_mount(dentry)) in places
in namespace.c that used to check for S_DEAD.  Setting S_DEAD
is still needed in places where we used to set it (for directories
getting killed), since we rely on it for readdir/rmdir race
prevention.

2) rename()/mount() protection has another bogosity - we unhash
the target before we'd checked that it's not a mountpoint.  Fixed.

3) ancient bogosity in pivot_root() - we locked i_mutex on the
right directory, but checked S_DEAD on the different (and wrong)
one.  Noticed and fixed.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-15 07:16:33 -04:00
Pavel Emelyanov 15ddb4aec5 NFSD: don't report compiled-out versions as present
The /proc/fs/nfsd/versions file calls nfsd_vers() to check whether
the particular nfsd version is present/available. The problem is
that once I turn off e.g. NFSD-V4 this call returns -1 which is
true from the callers POV which is wrong.

The proposal is to report false in that case.

The bug has existed since 6658d3a7bb "[PATCH] knfsd: remove
nfsd_versbits as intermediate storage for desired versions".

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Cc: stable@kernel.org
Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-05-14 18:46:14 -04:00
Trond Myklebust 9c7e7e2337 NFS: Don't call iput() in nfs_access_cache_shrinker
iput() can potentially attempt to allocate memory, so we should avoid
calling it in a memory shrinker. Instead, rely on the fact that iput() will
call nfs_access_zap_cache().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14 15:09:36 -04:00
Trond Myklebust 1a81bb8a1f NFS: Clean up nfs_access_zap_cache()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14 15:09:35 -04:00
Trond Myklebust 61d5eb2985 NFS: Don't run nfs_access_cache_shrinker() when the mask is GFP_NOFS
Both iput() and put_rpccred() might allocate memory under certain
circumstances, so make sure that we don't recurse and deadlock...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14 15:09:35 -04:00
Trond Myklebust 93870d76fe NFS: Read requests can use GFP_KERNEL.
There is no danger of deadlock should the allocation trigger page
writeback.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14 15:09:34 -04:00
Trond Myklebust 18eb884282 NFS: Clean up nfs_create_request()
There is no point in looping if we're out of memory.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14 15:09:34 -04:00
Trond Myklebust 8535b2be51 NFSv4: Don't use GFP_KERNEL allocations in state recovery
We do not want to have the state recovery thread kick off and wait for a
memory reclaim, since that may deadlock when the writebacks end up
waiting for the state recovery thread to complete.

The safe thing is therefore to use GFP_NOFS in all open, close,
delegation return, lock, etc. operations that may be called by the
state recovery thread.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14 15:09:33 -04:00
Chuck Lever 9bc4e3ca46 NFS: Calldata for nfs4_renew_done()
I'm about to change task->tk_start from a jiffies value to a ktime_t
value in order to make RPC RTT reporting more precise.

Recently (commit dc96aef9) nfs4_renew_done() started to reference
task->tk_start so that a jiffies value no longer had to be passed
from nfs4_proc_async_renew().  This allowed the calldata to point to
an nfs_client instead.

Changing task->tk_start to a ktime_t value makes it effectively
useless for renew timestamps, so we need to restore the pre-dc96aef9
logic that provided a jiffies "start" timestamp to nfs4_renew_done().

Both an nfs_client pointer and a timestamp need to be passed to
nfs4_renew_done(), so create a new nfs_renewdata structure that
contains both, resembling what is already done for delegreturn,
lock, and unlock.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14 15:09:32 -04:00
Chuck Lever dfe52c0419 NFS: Squelch compiler warning in nfs_add_server_stats()
Clean up:

fs/nfs/iostat.h: In function ‘nfs_add_server_stats’:
fs/nfs/iostat.h:41: warning: comparison between signed and unsigned integer expressions
fs/nfs/iostat.h:41: warning: comparison between signed and unsigned integer expressions
fs/nfs/iostat.h:41: warning: comparison between signed and unsigned integer expressions
fs/nfs/iostat.h:41: warning: comparison between signed and unsigned integer expressions

Commit fce22848 replaced the open-coded per-cpu logic in several
functions in fs/nfs/iostat.h with a single invocation of
this_cpu_ptr().  This macro assumes its second argument is signed,
not unsigned.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14 15:09:31 -04:00
Chuck Lever a6d5ff64ba NFS: Clean up fscache_uniq mount option
Clean up: fscache_uniq takes a string, so it should be included
with the other string mount option definitions, by convention.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14 15:09:31 -04:00
Chuck Lever 0f15c53d5b NFS: Squelch compiler warning
Seen with -Wextra:

/home/cel/linux/fs/nfs/fscache.c: In function ‘__nfs_readpages_from_fscache’:
/home/cel/linux/fs/nfs/fscache.c:479: warning: comparison between signed and unsigned integer expressions

The comparison implicitly converts "int" to "unsigned", making it
safe.  But there's no need for the implicit type conversions here, and
the dfprintk() already uses a "%u" formatter for "npages."  Better to
reduce confusion.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14 15:09:31 -04:00
Trond Myklebust bb8b27e504 NFSv4: Clean up the NFSv4 setclientid operation
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14 15:09:30 -04:00
Trond Myklebust d7cf8dd012 NFSv4: Allow attribute caching with 'noac' mounts if client holds a delegation
If the server has given us a delegation on a file, we _know_ that we can
cache the attribute information even when the user has specified 'noac'.

Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14 15:09:30 -04:00
Trond Myklebust fd86dfd263 NFSv4: Fix up the documentation for nfs_do_refmount
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14 15:09:29 -04:00
Trond Myklebust 1b4c6065b9 NFS: Replace nfsroot on-stack filehandle
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14 15:09:28 -04:00
Trond Myklebust b157b06ca2 NFS: Cleanup file handle allocations in fs/nfs/super.c
Use the new helper functions instead of open coding.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14 15:09:28 -04:00
Trond Myklebust ce587e07ba NFS: Prevent the mount code from looping forever on broken exports
Keep a global count of how many referrals that the current task has
traversed on a path lookup. Return ELOOP if the count exceeds
MAX_NESTED_LINKS.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14 15:09:28 -04:00
Trond Myklebust 6e94d62993 NFS: Reduce stack footprint of nfs3_proc_getacl() and nfs3_proc_setacl()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14 15:09:28 -04:00
Trond Myklebust ca7e9a0df2 NFS: Reduce stack footprint of nfs_statfs()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14 15:09:27 -04:00
Trond Myklebust 987f8dfc98 NFS: Reduce stack footprint of nfs_setattr()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14 15:09:27 -04:00
Trond Myklebust 0ab64e0e14 NFS: Reduce stack footprint of nfs4_proc_create()
Move the O_EXCL open handling into _nfs4_do_open() where it belongs. Doing
so also allows us to reuse the struct fattr from the opendata.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14 15:09:27 -04:00
Trond Myklebust 23a306120f NFS: Reduce the stack footprint of nfs_proc_symlink()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14 15:09:27 -04:00
Trond Myklebust eb872f0c8e NFS: Reduce the stack footprint of nfs_proc_create
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14 15:09:26 -04:00
Trond Myklebust 39967ddf19 NFS: Reduce the stack footprint of nfs_rmdir
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14 15:09:26 -04:00
Trond Myklebust d346890bea NFS: Reduce stack footprint of nfs_proc_remove()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14 15:09:26 -04:00
Trond Myklebust 3b14d6542d NFS: Reduce stack footprint of nfs3_proc_readlink()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14 15:09:25 -04:00
Trond Myklebust 136f2627c9 NFS: Reduce the stack footprint of nfs_link()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14 15:09:25 -04:00
Trond Myklebust aa49b4cf7d NFS: Reduce stack footprint of nfs_readdir()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14 15:09:25 -04:00
Trond Myklebust 011fff7239 NFS: Reduce stack footprint of nfs3_proc_rename() and nfs4_proc_rename()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14 15:09:25 -04:00
Trond Myklebust a3cba2aad9 NFS: Reduce stack footprint of nfs_revalidate_inode()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14 15:09:24 -04:00
Trond Myklebust c407d41a16 NFSv4: Reduce stack footprint of nfs4_proc_access() and nfs3_proc_access()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14 15:09:24 -04:00
Trond Myklebust 4f727296d2 NFSv4: Reduce the stack footprint of nfs4_remote_referral_get_sb
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14 15:09:23 -04:00
Trond Myklebust 8bac9db9cf NFSv4: Reduce stack footprint of nfs4_get_root()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14 15:09:23 -04:00
Trond Myklebust 04ffdbe2e6 NFS: Reduce the stack footprint of nfs_follow_remote_path()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14 15:09:23 -04:00
Trond Myklebust e1fb4d05d5 NFS: Reduce the stack footprint of nfs_lookup
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14 15:09:23 -04:00
Trond Myklebust 364d015e52 NFSv4: Reduce the stack footprint of try_location()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14 15:09:22 -04:00
Trond Myklebust fbca779a8d NFS: Reduce the stack footprint of nfs_create_server
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14 15:09:22 -04:00
Trond Myklebust a4d7f16806 NFS: Reduce the stack footprint of nfs_follow_mountpoint()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14 15:09:22 -04:00
Trond Myklebust 815409d22d NFSv4: Eliminate nfs4_path_walk()
All we really want is the ability to retrieve the root file handle. We no
longer need the ability to walk down the path, since that is now done in
nfs_follow_remote_path().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14 15:09:21 -04:00
Trond Myklebust 2d36bfde85 NFS: Add helper functions for allocating filehandles and fattr structs
NFS Filehandles and struct fattr are really too large to be allocated on
the stack. This patch adds in a couple of helper functions to allocate them
dynamically instead.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14 15:09:21 -04:00
Linus Torvalds 4fc4c3ce0d Merge branch 'for-linus' of git://git.infradead.org/users/eparis/notify
* 'for-linus' of git://git.infradead.org/users/eparis/notify:
  inotify: don't leak user struct on inotify release
  inotify: race use after free/double free in inotify inode marks
  inotify: clean up the inotify_add_watch out path
  Inotify: undefined reference to `anon_inode_getfd'

Manual merge to remove duplicate "select ANON_INODES" from Kconfig file
2010-05-14 11:49:42 -07:00
Pavel Emelyanov b3b38d842f inotify: don't leak user struct on inotify release
inotify_new_group() receives a get_uid-ed user_struct and saves the
reference on group->inotify_data.user.  The problem is that free_uid() is
never called on it.

Issue seem to be introduced by 63c882a0 (inotify: reimplement inotify
using fsnotify) after 2.6.30.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Eric Paris <eparis@parisplace.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Eric Paris <eparis@redhat.com>
2010-05-14 11:53:36 -04:00