linux/kernel
Balbir Singh 31a78f23ba mm owner: fix race between swapoff and exit
There's a race between mm->owner assignment and swapoff, more easily
seen when task slab poisoning is turned on.  The condition occurs when
try_to_unuse() runs in parallel with an exiting task.  A similar race
can occur with callers of get_task_mm(), such as /proc/<pid>/<mmstats>
or ptrace or page migration.

CPU0                                    CPU1
                                        try_to_unuse
                                        looks at mm = task0->mm
                                        increments mm->mm_users
task 0 exits
mm->owner needs to be updated, but no
new owner is found (mm_users > 1, but
no other task has task->mm = task0->mm)
mm_update_next_owner() leaves
                                        mmput(mm) decrements mm->mm_users
task0 freed
                                        dereferencing mm->owner fails

The fix is to notify the subsystem via mm_owner_changed callback(),
if no new owner is found, by specifying the new task as NULL.

Jiri Slaby:
mm->owner was set to NULL prior to calling cgroup_mm_owner_callbacks(), but
must be set after that, so as not to pass NULL as old owner causing oops.

Daisuke Nishimura:
mm_update_next_owner() may set mm->owner to NULL, but mem_cgroup_from_task()
and its callers need to take account of this situation to avoid oops.

Hugh Dickins:
Lockdep warning and hang below exec_mmap() when testing these patches.
exit_mm() up_reads mmap_sem before calling mm_update_next_owner(),
so exec_mmap() now needs to do the same.  And with that repositioning,
there's now no point in mm_need_new_owner() allowing for NULL mm.

Reported-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-09-29 08:41:47 -07:00
..
irq
power
time timers: fix build error in !oneshot case 2008-09-23 12:57:00 +02:00
trace
.gitignore
acct.c
audit.c
audit.h
audit_tree.c
auditfilter.c
auditsc.c
backtracetest.c
bounds.c
capability.c
cgroup.c mm owner: fix race between swapoff and exit 2008-09-29 08:41:47 -07:00
cgroup_debug.c
compat.c
configs.c
cpu.c
cpuset.c cpuset: avoid changing cpuset's cpus when -errno returned 2008-09-13 14:41:50 -07:00
delayacct.c
dma-coherent.c
dma.c
exec_domain.c
exit.c mm owner: fix race between swapoff and exit 2008-09-29 08:41:47 -07:00
extable.c
fork.c
futex.c
futex_compat.c
hrtimer.c
itimer.c
kallsyms.c
Kconfig.hz
Kconfig.preempt
kexec.c kexec: fix segmentation fault in kimage_add_entry 2008-09-23 08:09:14 -07:00
kfifo.c
kgdb.c kgdb, x86, arm, mips, powerpc: ignore user space single stepping 2008-09-26 10:36:41 -05:00
kmod.c
kprobes.c
ksysfs.c
kthread.c
latencytop.c
lockdep.c
lockdep_internals.h
lockdep_proc.c
Makefile
marker.c
module.c
mutex-debug.c
mutex-debug.h
mutex.c
mutex.h
notifier.c
ns_cgroup.c
nsproxy.c
panic.c
params.c
pid.c
pid_namespace.c
pm_qos_params.c
posix-cpu-timers.c
posix-timers.c
printk.c
profile.c
ptrace.c
rcuclassic.c
rcupdate.c
rcupreempt.c
rcupreempt_trace.c
rcutorture.c
relay.c
res_counter.c
resource.c
rtmutex-debug.c
rtmutex-debug.h
rtmutex-tester.c
rtmutex.c
rtmutex.h
rtmutex_common.h
rwsem.c
sched.c sched: fix init_hrtick() section mismatch warning 2008-09-23 11:02:13 +02:00
sched_clock.c
sched_cpupri.c
sched_cpupri.h
sched_debug.c
sched_fair.c
sched_features.h
sched_idletask.c
sched_rt.c sched: fix 2.6.27-rc5 couldn't boot on tulsa machine randomly 2008-09-11 09:34:28 +02:00
sched_stats.h
seccomp.c
semaphore.c
signal.c
smp.c
softirq.c
softlockup.c
spinlock.c
srcu.c
stacktrace.c
stop_machine.c
sys.c
sys_ni.c
sysctl.c
sysctl_check.c
taskstats.c
test_kprobes.c
time.c
timeconst.pl
timer.c
tsacct.c
uid16.c
user.c
user_namespace.c
utsname.c
utsname_sysctl.c
wait.c
workqueue.c