linux/kernel
Mel Gorman a1e78772d7 hugetlb: reserve huge pages for reliable MAP_PRIVATE hugetlbfs mappings until fork()
This patch reserves huge pages at mmap() time for MAP_PRIVATE mappings in
a similar manner to the reservations taken for MAP_SHARED mappings.  The
reserve count is accounted both globally and on a per-VMA basis for
private mappings.  This guarantees that a process that successfully calls
mmap() will successfully fault all pages in the future unless fork() is
called.

The characteristics of private mappings of hugetlbfs files behaviour after
this patch are;

1. The process calling mmap() is guaranteed to succeed all future faults until
   it forks().
2. On fork(), the parent may die due to SIGKILL on writes to the private
   mapping if enough pages are not available for the COW. For reasonably
   reliable behaviour in the face of a small huge page pool, children of
   hugepage-aware processes should not reference the mappings; such as
   might occur when fork()ing to exec().
3. On fork(), the child VMAs inherit no reserves. Reads on pages already
   faulted by the parent will succeed. Successful writes will depend on enough
   huge pages being free in the pool.
4. Quotas of the hugetlbfs mount are checked at reserve time for the mapper
   and at fault time otherwise.

Before this patch, all reads or writes in the child potentially needs page
allocations that can later lead to the death of the parent.  This applies
to reads and writes of uninstantiated pages as well as COW.  After the
patch it is only a write to an instantiated page that causes problems.

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Acked-by: Adam Litke <agl@us.ibm.com>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:16 -07:00
..
irq set_irq_wake: fix return code and wake status tracking 2008-07-23 09:35:53 -07:00
power Merge branch 'linus' into xen-64bit 2008-07-17 23:57:20 +02:00
time Merge branch 'cpus4096-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2008-07-23 18:37:44 -07:00
trace cpumask: Replace cpumask_of_cpu with cpumask_of_cpu_ptr 2008-07-18 22:02:57 +02:00
.gitignore Update kernel/.gitignore with new auto-generated files 2008-02-09 23:27:01 -08:00
Kconfig.hz sched, x86: clean up hrtick implementation 2008-07-20 10:37:28 +02:00
Kconfig.preempt rcu: move PREEMPT_RCU config option back under PREEMPT 2008-03-10 18:01:20 -07:00
Makefile Revert parts of "ftrace: do not trace scheduler functions" 2008-07-18 08:59:24 +02:00
acct.c bsd_acct: using task_struct->tgid is not right in pid-namespaces 2008-03-24 19:22:20 -07:00
audit.c [PATCH] remove useless argument type in audit_filter_user() 2008-06-24 23:36:35 -04:00
audit.h [PATCH 1/2] audit: move extern declarations to audit.h 2008-04-28 06:28:04 -04:00
audit_tree.c [PATCH] list_for_each_rcu must die: audit 2008-05-17 03:30:23 -04:00
auditfilter.c [PATCH] remove useless argument type in audit_filter_user() 2008-06-24 23:36:35 -04:00
auditsc.c x86_64 syscall audit fast-path 2008-07-23 17:47:32 -07:00
backtracetest.c backtrace: replace timer with tasklet + completions 2008-06-27 18:09:16 +02:00
bounds.c Add kbuild.h that contains common definitions for kbuild users 2008-04-29 08:06:29 -07:00
capability.c security: filesystem capabilities: fix fragile setuid fixup code 2008-07-04 10:40:08 -07:00
cgroup.c cgroups: remove node_ prefix_from ns subsystem 2008-05-24 09:56:14 -07:00
cgroup_debug.c CGroup API files: move "releasable" to cgroup_debug subsystem 2008-04-29 08:06:09 -07:00
compat.c ntp: support for TAI 2008-05-01 08:03:59 -07:00
configs.c kernel: use non-racy method for proc entries creation 2008-04-29 08:06:22 -07:00
cpu.c Merge branch 'sched/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2008-07-23 19:36:53 -07:00
cpuset.c Merge branch 'sched/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2008-07-23 19:36:53 -07:00
delayacct.c
dma.c kernel: use non-racy method for proc entries creation 2008-04-29 08:06:22 -07:00
exec_domain.c remove CONFIG_KMOD from core kernel code 2008-07-22 19:24:31 +10:00
exit.c fix dangling zombie when new parent ignores children 2008-07-16 18:02:34 -07:00
extable.c module: Don't report discarded init pages as kernel text. 2008-01-29 17:13:18 +11:00
fork.c hugetlb: reserve huge pages for reliable MAP_PRIVATE hugetlbfs mappings until fork() 2008-07-24 10:47:16 -07:00
futex.c futexes: fix fault handling in futex_lock_pi 2008-06-23 13:31:15 +02:00
futex_compat.c futex_compat __user annotation 2008-03-30 14:18:41 -07:00
hrtimer.c Merge branch 'generic-ipi' into generic-ipi-for-linus 2008-07-15 21:55:59 +02:00
itimer.c ITIMER_REAL: convert to use struct pid 2008-02-08 09:22:29 -08:00
kallsyms.c kernel: use non-racy method for proc entries creation 2008-04-29 08:06:22 -07:00
kexec.c kexec: make extended crashkernel= syntax less confusing 2008-05-01 08:04:00 -07:00
kfifo.c
kgdb.c kgdb: sparse fix 2008-06-24 10:52:55 -05:00
kmod.c remove CONFIG_KMOD from core kernel code 2008-07-22 19:24:31 +10:00
kprobes.c kernel/kprobes.c: Made kprobe_blacklist static. 2008-07-10 10:13:51 -07:00
ksysfs.c Kobject: convert remaining kobject_unregister() to kobject_put() 2008-01-24 20:40:40 -08:00
kthread.c Freezer: Introduce PF_FREEZER_NOSIG 2008-07-16 23:27:03 +02:00
latencytop.c kernel: use non-racy method for proc entries creation 2008-04-29 08:06:22 -07:00
lockdep.c Merge branch 'core/locking' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2008-07-14 14:55:13 -07:00
lockdep_internals.h lockdep: add lock_class information to lock_chain and output it 2008-06-24 01:28:20 +02:00
lockdep_proc.c lockdep: add lock_class information to lock_chain and output it 2008-06-24 01:28:20 +02:00
marker.c Markers - remove extra format argument 2008-05-23 22:25:27 +02:00
module.c modules: Take a shortcut for checking if an address is in a module 2008-07-22 19:24:28 +10:00
mutex-debug.c mutex-debug: check mutex magic before owner 2008-05-16 16:53:35 +02:00
mutex-debug.h
mutex.c __mutex_lock_common: use signal_pending_state() 2008-06-10 11:45:09 +02:00
mutex.h
notifier.c ipc: re-enable msgmni automatic recomputing msgmni if set to negative 2008-04-29 08:06:13 -07:00
ns_cgroup.c cgroups: kernel/ns_cgroup.c should #include <linux/nsproxy.h> 2008-04-29 08:06:07 -07:00
nsproxy.c ipc: sysvsem: refuse clone(CLONE_SYSVSEM|CLONE_NEWIPC) 2008-04-29 08:06:14 -07:00
panic.c Taint kernel after WARN_ON(condition) 2008-04-29 08:05:59 -07:00
params.c Add new string functions strict_strto* and convert kernel params to use them 2008-02-08 09:22:41 -08:00
pid.c rcu: split list.h and move rcu-protected lists into rculist.h 2008-05-19 10:01:37 +02:00
pid_namespace.c pidns: make pid->level and pid_ns->level unsigned 2008-04-30 08:29:49 -07:00
pm_qos_params.c pm_qos_params: BKL pushdown 2008-07-02 15:06:24 -06:00
posix-cpu-timers.c posix-timers: print RT watchdog message 2008-05-24 18:49:22 +02:00
posix-timers.c signals: join send_sigqueue() with send_group_sigqueue() 2008-04-30 08:29:36 -07:00
printk.c Merge branch 'core/printk' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2008-07-14 15:27:43 -07:00
profile.c on_each_cpu(): kill unused 'retry' parameter 2008-06-26 11:24:38 +02:00
ptrace.c ptrace children revamp 2008-07-16 18:02:33 -07:00
rcuclassic.c Merge branch 'linus' into cpus4096 2008-07-16 00:29:07 +02:00
rcupdate.c Merge branch 'generic-ipi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2008-07-15 14:12:03 -07:00
rcupreempt.c Merge branch 'linus' into cpus4096 2008-07-16 00:29:07 +02:00
rcupreempt_trace.c rcu: remove duplicated include in kernel/rcupreempt_trace.c 2008-05-19 10:03:39 +02:00
rcutorture.c rcu: make rcutorture even more vicious: invoke RCU readers from irq handlers (timers) 2008-06-26 09:24:33 +02:00
relay.c splice: fix sendfile() issue with relay 2008-05-28 14:49:27 +02:00
res_counter.c memcgroup: add the max_usage member on the res_counter 2008-04-29 08:06:10 -07:00
resource.c kernel: use non-racy method for proc entries creation 2008-04-29 08:06:22 -07:00
rtmutex-debug.c Don't operate with pid_t in rtmutex tester 2008-02-08 09:22:41 -08:00
rtmutex-debug.h
rtmutex-tester.c sysdev: Pass the attribute to the low level sysdev show/store function 2008-07-21 21:55:02 -07:00
rtmutex.c hrtimer: more hrtimer_init_sleeper() fallout. 2008-02-13 15:45:36 +01:00
rtmutex.h
rtmutex_common.h Don't operate with pid_t in rtmutex tester 2008-02-08 09:22:41 -08:00
rwsem.c
sched.c Merge branch 'sched/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2008-07-23 19:36:53 -07:00
sched_clock.c Merge branch 'sched/clock' into sched/devel 2008-07-14 12:19:13 +02:00
sched_cpupri.c sched: use a 2-d bitmap for searching lowest-pri CPU 2008-06-06 15:19:28 +02:00
sched_cpupri.h sched: fix the cpuprio count really 2008-06-06 15:19:44 +02:00
sched_debug.c sched: add full schedstats to /proc/sched_debug 2008-06-27 14:31:31 +02:00
sched_fair.c Merge branch 'sched/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2008-07-23 19:36:53 -07:00
sched_features.h sched: bias effective_load() error towards failing wake_affine(). 2008-06-27 14:31:47 +02:00
sched_idletask.c sched: make rt_sched_class, idle_sched_class static 2008-05-05 23:56:17 +02:00
sched_rt.c Merge branch 'sched/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2008-07-23 19:36:53 -07:00
sched_stats.h sched: fix accounting in task delay accounting & migration 2008-07-04 12:50:23 +02:00
seccomp.c
semaphore.c mmiotrace broken in linux-next (8-bit writes only) 2008-07-01 10:14:06 +02:00
signal.c posix timers: discard SI_TIMER signals on exec 2008-05-26 10:37:07 -07:00
smp.c generic ipi function calls: wait on alloc failure fallback 2008-07-15 14:12:20 -07:00
softirq.c Merge branch 'generic-ipi' into generic-ipi-for-linus 2008-07-15 21:55:59 +02:00
softlockup.c softlockup: print a module list on being stuck 2008-07-05 08:51:24 +02:00
spinlock.c ftrace: lockdep notrace annotations 2008-05-23 20:39:40 +02:00
srcu.c make srcu_readers_active() static 2008-02-06 10:41:02 -08:00
stacktrace.c stacktrace: fix modular build, export print_stack_trace and save_stack_trace 2008-06-30 09:20:55 +02:00
stop_machine.c cpumask: Replace cpumask_of_cpu with cpumask_of_cpu_ptr 2008-07-18 22:02:57 +02:00
sys.c sys_prctl(): fix return of uninitialized value 2008-05-24 09:56:13 -07:00
sys_ni.c Fix build on COMPAT platforms when CONFIG_EPOLL is disabled 2008-07-22 09:59:41 -07:00
sysctl.c mm/vmstat.c: proper externs 2008-07-24 10:47:14 -07:00
sysctl_check.c constify tables in kernel/sysctl_check.c 2008-02-08 09:22:31 -08:00
taskstats.c core: use performance variant for_each_cpu_mask_nr 2008-05-23 18:35:12 +02:00
test_kprobes.c kprobes: kretprobe user entry-handler 2008-02-06 10:41:11 -08:00
time.c Make constants in kernel/timeconst.h fixed 64 bits 2008-05-02 16:18:42 -07:00
timeconst.pl Make constants in kernel/timeconst.h fixed 64 bits 2008-05-02 16:18:42 -07:00
timer.c Merge branch 'for-linus' of master.kernel.org:/home/rmk/linux-2.6-arm 2008-07-14 16:06:58 -07:00
tsacct.c
uid16.c asmlinkage_protect replaces prevent_tail_call 2008-04-10 17:28:26 -07:00
user.c alloc_uid: cleanup 2008-04-30 08:29:53 -07:00
user_namespace.c eCryptfs: make key module subsystem respect namespaces 2008-04-29 08:06:07 -07:00
utsname.c kernel: explicitly include required header files under kernel/ 2008-04-29 08:06:04 -07:00
utsname_sysctl.c
wait.c kernel: remove fastcall in kernel/* 2008-02-08 09:22:31 -08:00
workqueue.c Merge commit 'v2.6.26-rc9' into cpus4096 2008-07-06 14:23:39 +02:00