linux/fs
Mel Gorman a1e78772d7 hugetlb: reserve huge pages for reliable MAP_PRIVATE hugetlbfs mappings until fork()
This patch reserves huge pages at mmap() time for MAP_PRIVATE mappings in
a similar manner to the reservations taken for MAP_SHARED mappings.  The
reserve count is accounted both globally and on a per-VMA basis for
private mappings.  This guarantees that a process that successfully calls
mmap() will successfully fault all pages in the future unless fork() is
called.

The characteristics of private mappings of hugetlbfs files behaviour after
this patch are;

1. The process calling mmap() is guaranteed to succeed all future faults until
   it forks().
2. On fork(), the parent may die due to SIGKILL on writes to the private
   mapping if enough pages are not available for the COW. For reasonably
   reliable behaviour in the face of a small huge page pool, children of
   hugepage-aware processes should not reference the mappings; such as
   might occur when fork()ing to exec().
3. On fork(), the child VMAs inherit no reserves. Reads on pages already
   faulted by the parent will succeed. Successful writes will depend on enough
   huge pages being free in the pool.
4. Quotas of the hugetlbfs mount are checked at reserve time for the mapper
   and at fault time otherwise.

Before this patch, all reads or writes in the child potentially needs page
allocations that can later lead to the death of the parent.  This applies
to reads and writes of uninstantiated pages as well as COW.  After the
patch it is only a write to an instantiated page that causes problems.

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Acked-by: Adam Litke <agl@us.ibm.com>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:16 -07:00
..
9p 9p: fix O_APPEND in legacy mode 2008-07-03 09:59:03 -05:00
adfs fs: replace remaining __FUNCTION__ occurrences 2008-04-30 08:29:54 -07:00
affs [PATCH] fix reservation discarding in affs 2008-05-06 13:45:33 -04:00
afs Fix various old email addresses for dwmw2 2008-06-06 11:29:10 -07:00
autofs mount options: fix autofs 2008-02-08 09:22:40 -08:00
autofs4 autofs: path_{get,put}() cleanups 2008-05-01 08:04:01 -07:00
befs byteorder: don't directly include linux/byteorder/generic.h 2008-05-16 12:01:45 -07:00
bfs fs: replace remaining __FUNCTION__ occurrences 2008-04-30 08:29:54 -07:00
cifs Merge commit 'v2.6.26' into bkl-removal 2008-07-14 15:29:34 -06:00
coda device create: coda: convert device_create to device_create_drvdata 2008-07-21 21:54:41 -07:00
configfs configfs: Allow ->make_item() and ->make_group() to return detailed errors. 2008-07-17 15:21:29 -07:00
cramfs fs: Remove unnecessary inclusions of asm/semaphore.h 2008-04-18 22:16:44 -04:00
debugfs debugfs: Implement debugfs_remove_recursive() 2008-07-21 21:54:59 -07:00
devpts devpts: factor out PTY index allocation 2008-04-30 08:29:48 -07:00
dlm configfs: Allow ->make_item() and ->make_group() to return detailed errors. 2008-07-17 15:21:29 -07:00
ecryptfs Merge commit 'v2.6.26' into bkl-removal 2008-07-14 15:29:34 -06:00
efs efs: update error msg to not refer to deleted read_inode() 2008-04-02 15:28:19 -07:00
exportfs fs: replace remaining __FUNCTION__ occurrences 2008-04-30 08:29:54 -07:00
ext2 ext2: retry block allocation if new blocks are allocated from system zone 2008-04-28 08:58:43 -07:00
ext3 ext3: add missing unlock to error path in ext3_quota_write() 2008-07-04 10:40:05 -07:00
ext4 ext4: do not set extents feature from the kernel 2008-07-11 19:27:31 -04:00
fat Merge commit 'v2.6.26' into bkl-removal 2008-07-14 15:29:34 -06:00
freevxfs fs/freevxfs/: proper externs 2008-04-29 08:06:00 -07:00
fuse fuse: fix thinko in max I/O size calucation 2008-06-17 18:08:10 -07:00
gfs2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw 2008-07-15 10:38:46 -07:00
hfs hfs: fix warning with 64k PAGE_SIZE 2008-04-30 08:29:52 -07:00
hfsplus Fix hfsplus oops on image without extents 2008-05-13 08:02:24 -07:00
hostfs uml: fix hostfs tv_usec calculations 2008-02-05 09:44:30 -08:00
hpfs mount options: fix hpfs 2008-02-08 09:22:40 -08:00
hppfs fix hppfs Makefile breakage 2008-05-21 16:55:58 -07:00
hugetlbfs hugetlb: reserve huge pages for reliable MAP_PRIVATE hugetlbfs mappings until fork() 2008-07-24 10:47:16 -07:00
isofs isofs: fix access to unallocated memory when reading corrupted filesystem 2008-04-30 08:29:33 -07:00
jbd jbd: need to hold j_state_lock to updates to transaction t_state to T_COMMIT 2008-05-14 19:11:14 -07:00
jbd2 ext4: Add ordered mode support for delalloc 2008-07-11 19:27:31 -04:00
jffs2 Merge git://git.infradead.org/mtd-2.6 2008-05-01 11:15:28 -07:00
jfs jfs: remove DIRENTSIZ 2008-06-10 15:12:58 -05:00
lockd Merge branch 'for-2.6.27' of git://linux-nfs.org/~bfields/linux 2008-07-20 21:21:46 -07:00
minix
msdos Replace BKL with superblock lock in fat/msdos/vfat 2008-06-20 14:05:54 -06:00
ncpfs Remove BKL from remote_llseek v2 2008-07-02 15:06:27 -06:00
nfs Merge branch 'bkl-removal' into next 2008-07-15 18:34:58 -04:00
nfs_common
nfsd Merge branch 'for-2.6.27' of git://linux-nfs.org/~bfields/linux 2008-07-20 21:21:46 -07:00
nls
ntfs ntfs: le*_add_cpu conversion 2008-05-24 09:56:08 -07:00
ocfs2 configfs: Allow ->make_item() and ->make_group() to return detailed errors. 2008-07-17 15:21:29 -07:00
openpromfs
partitions driver core: remove KOBJ_NAME_LEN define 2008-07-21 21:54:52 -07:00
proc mm/vmstat.c: proper externs 2008-07-24 10:47:14 -07:00
qnx4
ramfs ramfs: enable splice write 2008-07-04 09:52:14 +02:00
reiserfs reiserfs: discard prealloc in reiserfs_delete_inode 2008-07-08 12:39:31 -07:00
romfs ROMFS: Fix up an error in iget removal 2008-03-19 18:53:36 -07:00
smbfs Remove BKL from remote_llseek v2 2008-07-02 15:06:27 -06:00
sysfs driver core: Suppress sysfs warnings for device_rename(). 2008-07-21 21:55:01 -07:00
sysv sysv: [bl]e*_add_cpu conversion 2008-04-30 08:29:52 -07:00
ubifs UBIFS: include to compilation 2008-07-15 17:35:24 +03:00
udf udf: Fix regression in UDF anchor block detection 2008-06-24 11:38:03 +02:00
ufs ufs: remove unneeded ufs_put_inode prototype 2008-05-13 08:02:23 -07:00
vfat Replace BKL with superblock lock in fat/msdos/vfat 2008-06-20 14:05:54 -06:00
xfs Fix reference counting race on log buffers 2008-07-11 11:37:18 -07:00
Kconfig Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2 2008-07-17 10:55:51 -07:00
Kconfig.binfmt frv: don't offer BINFMT_FLAT 2008-06-06 11:29:08 -07:00
Makefile Merge branch 'for_linus' of git://git.infradead.org/~dedekind/ubifs-2.6 2008-07-16 15:02:57 -07:00
aio.c uml: activate_mm: remove the dead PF_BORROWED_MM check 2008-06-06 11:36:22 -07:00
anon_inodes.c [PATCH] sanitize anon_inode_getfd() 2008-05-01 13:08:50 -04:00
attr.c
bad_inode.c
binfmt_aout.c fs/binfmt_aout.c: use printk_ratelimit() 2008-04-29 08:06:04 -07:00
binfmt_elf.c execve filename: document and export via auxiliary vector 2008-07-22 09:59:40 -07:00
binfmt_elf_fdpic.c nommu: fix ksize() abuse 2008-06-06 11:29:13 -07:00
binfmt_em86.c binfmt_misc.c: avoid potential kernel stack overflow 2008-04-29 08:06:04 -07:00
binfmt_flat.c nommu: fix ksize() abuse 2008-06-06 11:29:13 -07:00
binfmt_misc.c binfmt_misc.c: avoid potential kernel stack overflow 2008-04-29 08:06:04 -07:00
binfmt_script.c binfmt_misc.c: avoid potential kernel stack overflow 2008-04-29 08:06:04 -07:00
binfmt_som.c [PATCH] sanitize handling of shared descriptor tables in failing execve() 2008-04-25 09:23:53 -04:00
bio-integrity.c block: integrity checkpatch cleanups 2008-07-03 13:21:13 +02:00
bio.c Add bvec_merge_data to handle stacked devices and ->merge_bvec() 2008-07-03 13:21:15 +02:00
block_dev.c [PATCH] fix cgroup-inflicted breakage in block_dev.c 2008-06-23 08:30:55 -04:00
buffer.c Merge branch 'generic-ipi' into generic-ipi-for-linus 2008-07-15 21:55:59 +02:00
char_dev.c Remove the lock_kernel() call from chrdev_open() 2008-06-20 14:05:53 -06:00
compat.c [PATCH] get rid of leak in compat_execve() 2008-05-16 17:23:05 -04:00
compat_binfmt_elf.c
compat_ioctl.c Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/holtmann/bluetooth-2.6 2008-07-19 00:30:39 -07:00
dcache.c fix soft lock up at NFS mount via per-SB LRU-list of unused dentries 2008-07-24 10:47:15 -07:00
dcookies.c d_path: Make d_path() use a struct path 2008-02-14 21:17:09 -08:00
direct-io.c Pagecache zeroing: zero_user_segment, zero_user_segments and zero_user 2008-02-05 09:44:13 -08:00
dnotify.c [PATCH] split linux/file.h 2008-05-01 13:08:16 -04:00
dquot.c quota: don't call sync_fs() from vfs_quota_off() when there's no quota turn off 2008-05-13 08:02:23 -07:00
drop_caches.c vfs: skip inodes without pages to free in drop_pagecache_sb() 2008-04-29 08:06:05 -07:00
eventfd.c [PATCH] sanitize anon_inode_getfd() 2008-05-01 13:08:50 -04:00
eventpoll.c [PATCH] sanitize anon_inode_getfd() 2008-05-01 13:08:50 -04:00
exec.c mm: remove double indirection on tlb parameter to free_pgd_range() & Co 2008-07-24 10:47:15 -07:00
fcntl.c Call fasync() functions without the BKL 2008-07-02 15:06:28 -06:00
fifo.c
file.c [PATCH] avoid multiplication overflows and signedness issues for max_fds 2008-05-16 17:22:52 -04:00
file_table.c [PATCH] split linux/file.h 2008-05-01 13:08:16 -04:00
filesystems.c
fs-writeback.c VFS: export sync_sb_inodes 2008-07-14 19:10:52 +03:00
generic_acl.c
inode.c VFS: fix unused variable warning 2008-05-06 13:13:37 -07:00
inotify.c
inotify_user.c Remove duplicated unlikely() in IS_ERR() 2008-04-29 08:06:25 -07:00
internal.h [PATCH] move a bunch of declarations to fs/internal.h 2008-04-21 23:11:01 -04:00
ioctl.c make vfs_ioctl() static 2008-04-29 08:06:00 -07:00
ioprio.c
libfs.c add kernel-doc for simple_read_from_buffer and memory_read_from_buffer 2008-07-04 10:40:07 -07:00
locks.c [patch 4/4] flock: remove unused fields from file_lock_operations 2008-06-23 11:52:30 -04:00
mbcache.c vfs: fix possible deadlock in ext2, ext3, ext4 when using xattrs 2008-04-15 19:35:41 -07:00
mpage.c vfs: add hooks for ext4's delayed allocation support 2008-07-11 19:27:31 -04:00
namei.c [patch 3/4] vfs: fix ERR_PTR abuse in generic_readlink 2008-06-23 11:52:30 -04:00
namespace.c LSM/SELinux: show LSM mount options in /proc/mounts 2008-07-14 15:02:05 +10:00
nfsctl.c Introduce path_put() 2008-02-14 21:13:33 -08:00
no-block.c
open.c security: filesystem capabilities: fix fragile setuid fixup code 2008-07-04 10:40:08 -07:00
pipe.c [patch 1/4] vfs: path_{get,put}() cleanups 2008-06-23 11:52:29 -04:00
pnode.c [patch 7/7] vfs: mountinfo: show dominating group id 2008-04-23 00:05:09 -04:00
pnode.h [patch 7/7] vfs: mountinfo: show dominating group id 2008-04-23 00:05:09 -04:00
posix_acl.c
quota.c quota: quota core changes for quotaon on remount 2008-04-28 08:58:33 -07:00
quota_v1.c quota: do not allow setting of quota limits to too high values 2008-04-28 08:58:32 -07:00
quota_v2.c quota: le*_add_cpu conversion 2008-04-30 08:29:51 -07:00
read_write.c Remove BKL from remote_llseek v2 2008-07-02 15:06:27 -06:00
read_write.h
readdir.c
select.c Fix performance regression on lmbench select benchmark 2008-06-22 12:23:15 -07:00
seq_file.c [patch 2/7] vfs: mountinfo: add seq_file_root() 2008-04-23 00:04:38 -04:00
signalfd.c [PATCH] sanitize anon_inode_getfd() 2008-05-01 13:08:50 -04:00
splice.c splice: fix generic_file_splice_read() race with page invalidation 2008-07-04 09:52:14 +02:00
stack.c
stat.c Introduce path_put() 2008-02-14 21:13:33 -08:00
super.c fix soft lock up at NFS mount via per-SB LRU-list of unused dentries 2008-07-24 10:47:15 -07:00
sync.c vfs: fix unconditional write_super() call in file_fsync() 2008-04-29 08:06:06 -07:00
timerfd.c [PATCH] sanitize anon_inode_getfd() 2008-05-01 13:08:50 -04:00
utimes.c [patch for 2.6.26 4/4] vfs: utimensat(): fix write access check for futimens() 2008-06-23 08:43:52 -04:00
xattr.c xattr: add missing consts to function arguments 2008-04-29 08:06:06 -07:00
xattr_acl.c