Commit Graph

17 Commits (fa30bd058b746c0e2318a77ff8b4977faa924c2c)

Author SHA1 Message Date
Mingming Cao b47b24781c [PATCH] ext3_get_blocks: multiple block allocation
Add support for multiple block allocation in ext3-get-blocks().

Look up the disk block mapping and count the total number of blocks to
allocate, then pass it to ext3_new_block(), where the real block allocation is
performed.  Once multiple blocks are allocated, prepare the branch with those
just allocated blocks info and finally splice the whole branch into the block
mapping tree.

Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:01 -08:00
Mingming Cao 89747d369d [PATCH] ext3_get_blocks: Mapping multiple blocks at a once
Currently ext3_get_block() only maps or allocates one block at a time.  This
is quite inefficient for sequential IO workload.

I have posted a early implements a simply multiple block map and allocation
with current ext3.  The basic idea is allocating the 1st block in the existing
way, and attempting to allocate the next adjacent blocks on a best effort
basis.  More description about the implementation could be found here:
http://marc.theaimsgroup.com/?l=ext2-devel&m=112162230003522&w=2

The following the latest version of the patch: break the original patch into 5
patches, re-worked some logicals, and fixed some bugs.  The break ups are:

 [patch 1] Adding map multiple blocks at a time in ext3_get_blocks()
 [patch 2] Extend ext3_get_blocks() to support multiple block allocation
 [patch 3] Implement multiple block allocation in ext3-try-to-allocate
 (called via ext3_new_block()).
 [patch 4] Proper accounting updates in ext3_new_blocks()
 [patch 5] Adjust reservation window size properly (by the given number
 of blocks to allocate) before block allocation to increase the
 possibility of allocating multiple blocks in a single call.

Tests done so far includes fsx,tiobench and dbench.  The following numbers
collected from Direct IO tests (1G file creation/read) shows the system time
have been greatly reduced (more than 50% on my 8 cpu system) with the patches.

 1G file DIO write:
 	2.6.15		2.6.15+patches
 real    0m31.275s	0m31.161s
 user    0m0.000s	0m0.000s
 sys     0m3.384s	0m0.564s

 1G file DIO read:
 	2.6.15		2.6.15+patches
 real    0m30.733s	0m30.624s
 user    0m0.000s	0m0.004s
 sys     0m0.748s	0m0.380s

Some previous test we did on buffered IO with using multiple blocks allocation
and delayed allocation shows noticeable improvement on throughput and system
time.

This patch:

Add support of mapping multiple blocks in one call.

This is useful for DIO reads and re-writes (where blocks are already
allocated), also is in line with Christoph's proposal of using getblocks() in
mpage_readpage() or mpage_readpages().

Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:00 -08:00
NeilBrown 2ff28e22bd [PATCH] Make address_space_operations->invalidatepage return void
The return value of this function is never used, so let's be honest and
declare it as void.

Some places where invalidatepage returned 0, I have inserted comments
suggesting a BUG_ON.

[akpm@osdl.org: JBD BUG fix]
[akpm@osdl.org: rework for git-nfs]
[akpm@osdl.org: don't go BUG in block_invalidate_page()]
Signed-off-by: Neil Brown <neilb@suse.de>
Acked-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:56:55 -08:00
Arjan van de Ven 9746151861 [PATCH] convert ext3's truncate_sem to a mutex
ext3's truncate_sem is always released in the same function it's taken
and it otherwise is a mutex as well..

Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23 07:38:14 -08:00
Andrew Morton d8733c2956 [PATCH] ext3_readdir: use generic readahead
Linus points out that ext3_readdir's readahead only cuts in when
ext3_readdir() is operating at the very start of the directory.  So for large
directories we end up performing no readahead at all and we suck.

So take it all out and use the core VM's page_cache_readahead().  This means
that ext3 directory reads will use all of readahead's dynamic sizing goop.

Note that we're using the directory's filp->f_ra to hold the readahead state,
but readahead is actually being performed against the underlying blockdev's
address_space.  Fortunately the readahead code is all set up to handle this.

Tested with printk.  It works.  I was struggling to find a real workload which
actually cared.

(The patch also exports page_cache_readahead() to GPL modules)

Cc: "Stephen C. Tweedie" <sct@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23 07:38:09 -08:00
Badari Pulavarty cd6ef84e6a [PATCH] ext3: fix nobh mode for chattr +j inodes
One can do "chattr +j" on a file to change its journalling mode.  Fix
writeback mode with "nobh" handling for it.

Even though, we mount ext3 filesystem in writeback mode with "nobh" option,
some one can do "chattr +j" on a single file to force it to do journalled
mode.  In order to do journaling, ext3_block_truncate_page() need to
fallback to default case of creating buffers and adding them to transaction
etc.

Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-11 09:19:34 -08:00
Christoph Lameter e965f9630c [PATCH] Direct Migration V9: Avoid writeback / page_migrate() method
Migrate a page with buffers without requiring writeback

This introduces a new address space operation migratepage() that may be used
by a filesystem to implement its own version of page migration.

A version is provided that migrates buffers attached to pages.  Some
filesystems (ext2, ext3, xfs) are modified to utilize this feature.

The swapper address space operation are modified so that a regular
migrate_page() will occur for anonymous pages without writeback (migrate_pages
forces every anonymous page to have a swap entry).

Signed-off-by: Mike Kravetz <kravetz@us.ibm.com>
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:17 -08:00
Denis Lunev ab4eb43ce7 [PATCH] ext3: journal handling on error path in ext3_journalled_writepage()
This patch fixes lost referrence on ext3 current handle in
ext3_journalled_writepage().

Signed-Off-By: Denis Lunev <den@sw.ru>
Cc: Kirill Korotaev <dev@sw.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-13 18:14:15 -08:00
Glauber de Oliveira Costa 2973dfdb87 [PATCH] Test for sb_getblk return value
This patch adds tests for the return value of sb_getblk() in the ext2/3
filesystems.  In fs/buffer.c it is stated that the getblk() function never
fails.  However, it does can return NULL in some situations due to I/O
errors, which may lead us to NULL pointer dereferences

Signed-off-by: Glauber de Oliveira Costa <glommer@br.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:26 -08:00
Glauber de Oliveira Costa 5b11687924 [PATCH] Locking problems while EXT3FS_DEBUG on
I noticed some problems while running ext3 with the debug flag set on.
More precisely, I was unable to umount the filesystem.  Some investigation
took me to the patch that follows.

At a first glance , the lock/unlock I've taken out seems really not
necessary, as the main code (outside debug) does not lock the super.  The
only additional danger operations that debug code introduces seems to be
related to bitmap, but bitmap operations tends to be all atomic anyway.

I also took the opportunity to fix 2 spelling errors.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:23 -08:00
Al Viro 27496a8c67 [PATCH] gfp_t: fs/*
- ->releasepage() annotated (s/int/gfp_t), instances updated
 - missing gfp_t in fs/* added
 - fixed misannotation from the original sweep caught by bitwise checks:
   XFS used __nocast both for gfp_t and for flags used by XFS allocator.
   The latter left with unsigned int __nocast; we might want to add a
   different type for those but for now let's leave them alone.  That,
   BTW, is a case when __nocast use had been actively confusing - it had
   been used in the same code for two different and similar types, with
   no way to catch misuses.  Switch of gfp_t to bitwise had caught that
   immediately...

One tricky bit is left alone to be dealt with later - mapping->flags is
a mix of gfp_t and error indications.  Left alone for now.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-28 08:16:47 -07:00
Mark Fasheh fef266580e [PATCH] update filesystems for new delete_inode behavior
Update the file systems in fs/ implementing a delete_inode() callback to
call truncate_inode_pages().  One implementation note: In developing this
patch I put the calls to truncate_inode_pages() at the very top of those
filesystems delete_inode() callbacks in order to retain the previous
behavior.  I'm guessing that some of those could probably be optimized.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Acked-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 13:57:27 -07:00
Andreas Gruenbacher ff87b37da9 [PATCH] ext3 xattr: Don't write to the in-inode xattr space of reserved inodes
We are not using the in-inode space for xattrs in reserved inodes because
mkfs.ext3 doesn't initialize it properly.  For those inodes, we set
i_extra_isize to 0.  Make sure that we also don't overwrite the
i_extra_isize field when writing out the inode in that case.  This is for
cleanliness only, and doesn't fix an actual bug.

Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:23:46 -07:00
Jan Kara 1f54587bea [PATCH] quota: ext3: Improve quota credit estimates
Use improved credits estimates for quota operations.  Also reserve a space
for a quota operation in a transaction only if filesystem was mounted with
some quota options.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:05:20 -07:00
Andrew Morton d17d7fa44d [PATCH] revert ext3-writepages-support-for-writeback-mode
This had a fatal lock ranking bug: we do journal_start outside
mpage_writepages()'s lock_page().

Revert the whole thing, think again.

Credit-to: Jan Kara <jack@suse.cz>

For identifying the bug.

Cc: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-05 16:36:44 -07:00
Mingming Cao fe55c45236 [PATCH] ext3: remove unnecessary race then retry in ext3_get_block
The extra race-with-truncate-then-retry logic around
ext3_get_block_handle(), which was inherited from ext2, becomes unecessary
for ext3, since we have already obtained the ei->truncate_sem in
ext3_get_block_handle() before calling ext3_alloc_branch().  The
ei->truncate_sem is already there to block concurrent truncate and block
allocation on the same inode.  So the inode's indirect addressing tree
won't be changed after we grab that semaphore.

We could, after get the semaphore, re-verify the branch is up-to-date or
not.  If it has been changed, then get the updated branch.  If we still
need block allocation, we will have a safe version of the branch to work
with in the ext3_find_goal()/ext3_splice_branch().

The code becomes more readable after remove those retry logic.  The patch
also clean up some gotos in ext3_get_block_handle() to make it more
readable.

Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-01 08:59:20 -07:00
Linus Torvalds 1da177e4c3 Linux-2.6.12-rc2
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.

Let it rip!
2005-04-16 15:20:36 -07:00