linux/fs
Andreas Dilger 717d50e497 Ext4: Uninitialized Block Groups
In pass1 of e2fsck, every inode table in the fileystem is scanned and checked,
regardless of whether it is in use.  This is this the most time consuming part
of the filesystem check.  The unintialized block group feature can greatly
reduce e2fsck time by eliminating checking of uninitialized inodes.

With this feature, there is a a high water mark of used inodes for each block
group.  Block and inode bitmaps can be uninitialized on disk via a flag in the
group descriptor to avoid reading or scanning them at e2fsck time.  A checksum
of each group descriptor is used to ensure that corruption in the group
descriptor's bit flags does not cause incorrect operation.

The feature is enabled through a mkfs option

	mke2fs /dev/ -O uninit_groups

A patch adding support for uninitialized block groups to e2fsprogs tools has
been posted to the linux-ext4 mailing list.

The patches have been stress tested with fsstress and fsx.  In performance
tests testing e2fsck time, we have seen that e2fsck time on ext3 grows
linearly with the total number of inodes in the filesytem.  In ext4 with the
uninitialized block groups feature, the e2fsck time is constant, based
solely on the number of used inodes rather than the total inode count.
Since typical ext4 filesystems only use 1-10% of their inodes, this feature can
greatly reduce e2fsck time for users.  With performance improvement of 2-20
times, depending on how full the filesystem is.

The attached graph shows the major improvements in e2fsck times in filesystems
with a large total inode count, but few inodes in use.

In each group descriptor if we have

EXT4_BG_INODE_UNINIT set in bg_flags:
        Inode table is not initialized/used in this group. So we can skip
        the consistency check during fsck.
EXT4_BG_BLOCK_UNINIT set in bg_flags:
        No block in the group is used. So we can skip the block bitmap
        verification for this group.

We also add two new fields to group descriptor as a part of
uninitialized group patch.

        __le16  bg_itable_unused;       /* Unused inodes count */
        __le16  bg_checksum;            /* crc16(sb_uuid+group+desc) */

bg_itable_unused:

If we have EXT4_BG_INODE_UNINIT not set in bg_flags
then bg_itable_unused will give the offset within
the inode table till the inodes are used. This can be
used by fsck to skip list of inodes that are marked unused.

bg_checksum:
Now that we depend on bg_flags and bg_itable_unused to determine
the block and inode usage, we need to make sure group descriptor
is not corrupt. We add checksum to group descriptor to
detect corruption. If the descriptor is found to be corrupt, we
mark all the blocks and inodes in the group used.

Signed-off-by: Avantika Mathur <mathur@us.ibm.com>
Signed-off-by: Andreas Dilger <adilger@clusterfs.com>
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
2007-10-17 18:50:00 -04:00
..
9p 9p: fix bad kconfig cross-dependency 2007-10-17 14:31:07 -05:00
adfs Slab API: remove useless ctor parameter and reorder parameters 2007-10-17 08:42:45 -07:00
affs fs: mark nibblemap const 2007-10-17 08:42:47 -07:00
afs KEYS: Make request_key() and co fundamentally asynchronous 2007-10-17 08:42:57 -07:00
autofs Replace pid_t in autofs with struct pid reference 2007-05-11 08:29:36 -07:00
autofs4 fs/autofs4/inode.c: kmalloc + memset conversion to kzalloc 2007-10-17 08:42:50 -07:00
befs Slab API: remove useless ctor parameter and reorder parameters 2007-10-17 08:42:45 -07:00
bfs Slab API: remove useless ctor parameter and reorder parameters 2007-10-17 08:42:45 -07:00
cifs Slab API: remove useless ctor parameter and reorder parameters 2007-10-17 08:42:45 -07:00
coda Slab API: remove useless ctor parameter and reorder parameters 2007-10-17 08:42:45 -07:00
configfs r/o bind mounts: filesystem helpers for custom 'struct file's 2007-10-17 08:43:04 -07:00
cramfs cramfs: error message about endianess 2007-10-17 08:42:53 -07:00
debugfs docbook: fix filesystems content 2007-10-15 17:56:36 -07:00
devpts devpts: add fsnotify create event 2007-05-08 11:14:59 -07:00
dlm menuconfig: transform NLS and DLM menus 2007-10-17 08:43:00 -07:00
ecryptfs Clean up duplicate includes in fs/ecryptfs/ 2007-10-17 08:42:48 -07:00
efs Slab API: remove useless ctor parameter and reorder parameters 2007-10-17 08:42:45 -07:00
exportfs knfsd: exportfs: split out reconnecting a dentry from find_exported_dentry 2007-07-17 10:23:06 -07:00
ext2 ext2 reservations 2007-10-17 08:43:02 -07:00
ext3 ext3: lighten up resize transaction requirements 2007-10-17 08:43:01 -07:00
ext4 Ext4: Uninitialized Block Groups 2007-10-17 18:50:00 -04:00
fat Slab API: remove useless ctor parameter and reorder parameters 2007-10-17 08:42:45 -07:00
freevxfs mm: Remove slab destructors from kmem_cache_create(). 2007-07-20 10:11:58 +09:00
fuse fuse: clean up execute permission checking 2007-10-17 08:43:04 -07:00
gfs2 fs: correct SuS compliance for open of large file without options 2007-10-17 08:43:01 -07:00
hfs Slab API: remove useless ctor parameter and reorder parameters 2007-10-17 08:42:45 -07:00
hfsplus Slab API: remove useless ctor parameter and reorder parameters 2007-10-17 08:42:45 -07:00
hostfs uml: fix hostfs style 2007-10-16 09:43:07 -07:00
hpfs Slab API: remove useless ctor parameter and reorder parameters 2007-10-17 08:42:45 -07:00
hppfs [PATCH] Mark struct super_operations const 2007-02-12 09:48:47 -08:00
hugetlbfs r/o bind mounts: filesystem helpers for custom 'struct file's 2007-10-17 08:43:04 -07:00
isofs fs/isofs/namei.c: Remove uninitialized local vars warning 2007-10-17 08:42:58 -07:00
jbd JBD: replace jbd_kmalloc with kmalloc directly 2007-10-17 18:49:57 -04:00
jbd2 JBD2: debug code cleanup. 2007-10-17 18:49:59 -04:00
jffs2 Slab API: remove useless ctor parameter and reorder parameters 2007-10-17 08:42:45 -07:00
jfs introduce I_SYNC 2007-10-17 08:43:02 -07:00
lockd NFS/SUNRPC: use transport protocol naming 2007-10-09 17:17:53 -04:00
minix limit minixfs printks on corrupted dir i_size 2007-10-17 08:42:53 -07:00
msdos [PATCH] mark struct inode_operations const 2 2007-02-12 09:48:46 -08:00
ncpfs Slab API: remove useless ctor parameter and reorder parameters 2007-10-17 08:42:45 -07:00
nfs Slab API: remove useless ctor parameter and reorder parameters 2007-10-17 08:42:45 -07:00
nfs_common [PATCH] nfs_common endianness annotations 2006-10-20 10:26:41 -07:00
nfsd Implement file posix capabilities 2007-10-17 08:43:07 -07:00
nls menuconfig: transform NLS and DLM menus 2007-10-17 08:43:00 -07:00
ntfs writeback: fix ntfs with sb_has_dirty_inodes() 2007-10-17 08:43:02 -07:00
ocfs2 Fix f_version type: should be u64 instead of unsigned long 2007-10-17 08:42:53 -07:00
openpromfs Slab API: remove useless ctor parameter and reorder parameters 2007-10-17 08:42:45 -07:00
partitions fs/partitions/sun.c endianness annotations 2007-10-14 12:41:51 -07:00
proc Don't truncate /proc/PID/environ at 4096 characters 2007-10-17 08:43:00 -07:00
qnx4 Slab API: remove useless ctor parameter and reorder parameters 2007-10-17 08:42:45 -07:00
ramfs Remove valueless definition of hard-selected RAMFS option 2007-10-17 08:42:56 -07:00
reiserfs reiserfs: do not repair wrong journal params 2007-10-17 08:43:01 -07:00
romfs fs/romfs/inode.c: trivial improvements 2007-10-17 08:42:47 -07:00
smbfs Slab API: remove useless ctor parameter and reorder parameters 2007-10-17 08:42:45 -07:00
sysfs spin_lock_unlocked cleanups 2007-10-17 08:43:01 -07:00
sysv Slab API: remove useless ctor parameter and reorder parameters 2007-10-17 08:42:45 -07:00
udf fs/udf/balloc.c: mark a variable as uninitialized_var() 2007-10-17 08:43:00 -07:00
ufs ufs: Fix mount check in ufs_fill_super() 2007-10-17 08:42:51 -07:00
vfat [PATCH] mark struct inode_operations const 3 2007-02-12 09:48:46 -08:00
xfs Merge branch 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6 2007-10-17 09:04:11 -07:00
Kconfig Ext4: Uninitialized Block Groups 2007-10-17 18:50:00 -04:00
Kconfig.binfmt fs: Kill sh dependency for binfmt_flat. 2007-05-21 14:34:00 +09:00
Makefile Remove valueless definition of hard-selected RAMFS option 2007-10-17 08:42:56 -07:00
aio.c aio: account I/O wait time properly 2007-10-17 08:42:53 -07:00
anon_inodes.c anon-inodes use open coded atomic_inc for the shared inode 2007-10-17 08:43:00 -07:00
attr.c Implement file posix capabilities 2007-10-17 08:43:07 -07:00
bad_inode.c sendfile: remove bad_sendfile() from bad_file_ops 2007-07-10 08:04:15 +02:00
binfmt_aout.c core_pattern: ignore RLIMIT_CORE if core_pattern is a pipe 2007-10-17 08:42:50 -07:00
binfmt_elf.c Break ELF_PLATFORM and stack pointer randomization dependency 2007-10-17 08:43:01 -07:00
binfmt_elf_fdpic.c core_pattern: ignore RLIMIT_CORE if core_pattern is a pipe 2007-10-17 08:42:50 -07:00
binfmt_em86.c header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
binfmt_flat.c binfmt_flat: warning fixes 2007-10-17 08:42:54 -07:00
binfmt_misc.c mm: variable length argument support 2007-07-19 10:04:45 -07:00
binfmt_script.c mm: variable length argument support 2007-07-19 10:04:45 -07:00
binfmt_som.c core_pattern: ignore RLIMIT_CORE if core_pattern is a pipe 2007-10-17 08:42:50 -07:00
bio.c bio: make freeing of ->bi_io_vec conditional in bio_free() 2007-10-16 11:03:52 +02:00
block_dev.c Slab API: remove useless ctor parameter and reorder parameters 2007-10-17 08:42:45 -07:00
buffer.c writeback: remove pages_skipped accounting in __block_write_full_page() 2007-10-17 08:43:02 -07:00
char_dev.c mm: bdi init hooks 2007-10-17 08:42:45 -07:00
compat.c mm: variable length argument support 2007-07-19 10:04:45 -07:00
compat_ioctl.c Clean up duplicate includes in fs/ 2007-10-17 08:42:48 -07:00
dcache.c vfs: use the predefined d_unhashed inline function instead 2007-10-17 08:43:00 -07:00
dcookies.c Remove fs.h from mm.h 2007-07-29 17:09:29 -07:00
direct-io.c remove ZERO_PAGE 2007-10-16 09:42:53 -07:00
dnotify.c mm: Remove slab destructors from kmem_cache_create(). 2007-07-20 10:11:58 +09:00
dquot.c quota: send messages via netlink 2007-10-17 08:42:56 -07:00
drop_caches.c invalidate_mapping_pages(): add cond_resched 2007-07-16 09:05:36 -07:00
eventfd.c eventfd use waitqueue lock ... 2007-05-18 13:09:34 -07:00
eventpoll.c mm: Remove slab destructors from kmem_cache_create(). 2007-07-20 10:11:58 +09:00
exec.c security/ cleanups 2007-10-17 08:43:07 -07:00
fcntl.c F_DUPFD_CLOEXEC implementation 2007-10-17 08:43:01 -07:00
fifo.c Detach sched.h from mm.h 2007-05-21 09:18:19 -07:00
file.c [PATCH] fdtable: Provide free_fdtable() wrapper 2006-12-22 08:55:50 -08:00
file_table.c r/o bind mounts: filesystem helpers for custom 'struct file's 2007-10-17 08:43:04 -07:00
filesystems.c add filesystem subtype support 2007-05-08 11:15:01 -07:00
fs-writeback.c introduce I_SYNC 2007-10-17 08:43:02 -07:00
generic_acl.c Introduce is_owner_or_cap() to wrap CAP_FOWNER use with fsuid check 2007-07-17 12:00:03 -07:00
inode.c introduce I_SYNC 2007-10-17 08:43:02 -07:00
inotify.c Introduce a handy list_first_entry macro 2007-05-08 11:15:11 -07:00
inotify_user.c change inotifyfs magic as the same magic is used for futexfs 2007-10-17 08:43:00 -07:00
internal.h cleanup compat ioctl handling 2007-05-08 11:15:09 -07:00
ioctl.c drop obsolete sys_ioctl export 2007-07-16 09:05:48 -07:00
ioprio.c [PATCH] pid: replace do/while_each_task_pid with do/while_each_pid_task 2007-02-12 09:48:32 -08:00
libfs.c make fs/libfs.c:simple_commit_write() static 2007-10-17 08:42:53 -07:00
locks.c Slab API: remove useless ctor parameter and reorder parameters 2007-10-17 08:42:45 -07:00
mbcache.c mm: Remove slab destructors from kmem_cache_create(). 2007-07-20 10:11:58 +09:00
mpage.c mm: buffered write cleanup 2007-10-16 09:42:54 -07:00
namei.c r/o bind mounts: give permission() a local 'mnt' variable 2007-10-17 08:43:05 -07:00
namespace.c fs: remove the unused mempages parameter 2007-10-17 08:42:49 -07:00
nfsctl.c nfsctl: use vfs_path_lookup 2007-07-19 10:04:45 -07:00
no-block.c
open.c Implement file posix capabilities 2007-10-17 08:43:07 -07:00
pipe.c sched: affine sync wakeups 2007-10-15 17:00:19 +02:00
pnode.c Introduce a handy list_first_entry macro 2007-05-08 11:15:11 -07:00
pnode.h
posix_acl.c
quota.c [IA64] Fix build failure in fs/quota.c 2007-07-27 15:40:13 -07:00
quota_v1.c
quota_v2.c
read_write.c Cleanup macros for distinguishing mandatory locks 2007-10-09 18:32:46 -04:00
read_write.h
readdir.c ROUND_UP macro cleanup in fs/(select|compat|readdir).c 2007-05-08 11:15:09 -07:00
select.c Use ERESTART_RESTARTBLOCK if poll() is interrupted by a signal 2007-10-17 08:42:53 -07:00
seq_file.c [FS] seq_file: Introduce the seq_open_private() 2007-10-10 16:55:33 -07:00
signalfd.c rename signalfd_siginfo fields 2007-10-17 08:43:01 -07:00
splice.c Implement file posix capabilities 2007-10-17 08:43:07 -07:00
stack.c [PATCH] fs/stack.c: Copy i_nlink after all other attributes are copied 2007-02-19 14:21:50 -08:00
stat.c header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
super.c writeback: fix periodic superblock dirty inode flushing 2007-10-17 08:43:02 -07:00
sync.c Introduce fixed sys_sync_file_range2() syscall, implement on PowerPC and ARM 2007-06-28 11:38:30 -07:00
timerfd.c make timerfd return a u64 and fix the __put_user 2007-07-26 11:35:17 -07:00
utimes.c VFS: check nanoseconds in utimensat 2007-10-17 08:42:52 -07:00
xattr.c Introduce is_owner_or_cap() to wrap CAP_FOWNER use with fsuid check 2007-07-17 12:00:03 -07:00
xattr_acl.c [PATCH] remove many unneeded #includes of sched.h 2007-02-14 08:09:54 -08:00