q3k/linux

Michal Hocko 9f50fad65b Revert "memcg: get rid of percpu_charge_mutex lock"

This reverts commit `8521fc50d4`.

The patch incorrectly assumes that using atomic FLUSHING_CACHED_CHARGE bit operations is sufficient but that is not true. Johannes Weiner has reported a crash during parallel memory cgroup removal:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
IP: [<ffffffff81083b70>] css_is_ancestor+0x20/0x70
Oops: 0000 [#1] PREEMPT SMP
Pid: 19677, comm: rmdir Tainted: G W 3.0.0-mm1-00188-gf38d32b #35 ECS MCP61M-M3/MCP61M-M3
RIP: 0010:[<ffffffff81083b70>] css_is_ancestor+0x20/0x70
RSP: 0018:ffff880077b09c88 EFLAGS: 00010202
Process rmdir (pid: 19677, threadinfo ffff880077b08000, task ffff8800781bb310)
Call Trace:
 [<ffffffff810feba3>] mem_cgroup_same_or_subtree+0x33/0x40
 [<ffffffff810feccf>] drain_all_stock+0x11f/0x170
 [<ffffffff81103211>] mem_cgroup_force_empty+0x231/0x6d0
 [<ffffffff811036c4>] mem_cgroup_pre_destroy+0x14/0x20
 [<ffffffff81080559>] cgroup_rmdir+0xb9/0x500
 [<ffffffff81114d26>] vfs_rmdir+0x86/0xe0
 [<ffffffff81114e7b>] do_rmdir+0xfb/0x110
 [<ffffffff81114ea6>] sys_rmdir+0x16/0x20
 [<ffffffff8154d76b>] system_call_fastpath+0x16/0x1b

We are crashing because we try to dereference cached memcg when we are checking whether we should wait for draining on the cache. The cache is already cleaned up, though.

There is also a theoretical chance that the cached memcg gets freed between we test for the FLUSHING_CACHED_CHARGE and dereference it in mem_cgroup_same_or_subtree:

CPU0 CPU1 CPU2 mem=stock->cached stock->cached=NULL clear_bit test_and_set_bit test_bit() ... <preempted> mem_cgroup_destroy use after free

The percpu_charge_mutex protected from this race because sync draining is exclusive.

It is safer to revert now and come up with a more parallel implementation later.

Signed-off-by: Michal Hocko <mhocko@suse.cz>
Reported-by: Johannes Weiner <jweiner@redhat.com>
Acked-by: Johannes Weiner <jweiner@redhat.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

2011-08-09 17:04:43 -07:00
backing-dev.c	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/writeback	2011-07-26 10:39:54 -07:00
bootmem.c	crash_dump: export is_kdump_kernel to modules, consolidate elfcorehdr_addr, setup_elfcorehdr and saved_max_pfn	2011-03-23 19:47:19 -07:00
bounce.c	bounce: call flush_dcache_page() after bounce_copy_vec()	2010-09-09 18:57:25 -07:00
cleancache.c	mm: cleancache core ops functions and config	2011-05-26 10:01:36 -06:00
compaction.c	mm: compaction: abort compaction if too many pages are isolated and caller is asynchronous V2	2011-06-15 20:04:02 -07:00
debug-pagealloc.c
dmapool.c	devres: fix possible use after free	2011-07-25 20:57:14 -07:00
fadvise.c
failslab.c	fault-injection: add ability to export fault_attr in arbitrary directory	2011-08-03 14:25:20 -10:00
filemap.c	mm: clarify the radix_tree exceptional cases	2011-08-03 14:25:24 -10:00
filemap_xip.c	mm: Convert i_mmap_lock to a mutex	2011-05-25 08:39:18 -07:00
fremap.c	mm: don't access vm_flags as 'int'	2011-05-26 09:20:31 -07:00
highmem.c	mm,x86: fix kmap_atomic_push vs ioremap_32.c	2010-10-27 18:03:05 -07:00
huge_memory.c	mm/huge_memory.c: minor lock simplification in __khugepaged_exit	2011-07-25 20:57:09 -07:00
hugetlb.c	mm: hugetlb: fix coding style issues	2011-07-25 20:57:09 -07:00
hwpoison-inject.c	Fix common misspellings	2011-03-31 11:26:23 -03:00
init-mm.c	atomic: use <linux/atomic.h>	2011-07-26 16:49:47 -07:00
internal.h	mm: nommu: sort mm->mmap list properly	2011-05-25 08:39:05 -07:00
Kconfig	mm Kconfig typo: cleancacne -> cleancache	2011-06-10 14:47:52 +02:00
Kconfig.debug	mm: debug-pagealloc: fix kconfig dependency warning	2011-03-22 17:44:02 -07:00
kmemcheck.c	kmemcheck: Fix build errors due to missing slab.h	2010-03-30 22:02:32 +09:00
kmemleak-test.c	kmemleak: remove memset by using kzalloc	2011-01-27 18:31:51 +00:00
kmemleak.c	atomic: use <linux/atomic.h>	2011-07-26 16:49:47 -07:00
ksm.c	ksm: fix NULL pointer dereference in scan_get_next_rmap_item()	2011-06-15 20:04:02 -07:00
maccess.c	maccess,probe_kernel: Make write/read src const void *	2011-05-25 19:56:23 -04:00
madvise.c	fs: kill i_alloc_sem	2011-07-20 20:47:46 -04:00
Makefile	mm: cleancache core ops functions and config	2011-05-26 10:01:36 -06:00
memblock.c	mm/memblock.c: avoid abuse of RED_INACTIVE	2011-07-25 20:57:09 -07:00
memcontrol.c	Revert "memcg: get rid of percpu_charge_mutex lock"	2011-08-09 17:04:43 -07:00
memory-failure.c	HWPoison: add memory_failure_queue()	2011-08-03 11:15:58 -04:00
memory.c	mm/futex: fix futex writes on archs with SW tracking of dirty & young	2011-07-25 20:57:11 -07:00
memory_hotplug.c	mm: extend memory hotplug API to allow memory hotplug in virtual machines	2011-07-25 20:57:08 -07:00
mempolicy.c	cpusets: randomize node rotor used in cpuset_mem_spread_node()	2011-07-26 16:49:43 -07:00
mempool.c
migrate.c	migrate: don't account swapcache as shmem	2011-06-16 15:01:24 -07:00
mincore.c	mm: clarify the radix_tree exceptional cases	2011-08-03 14:25:24 -10:00
mlock.c	mm: don't access vm_flags as 'int'	2011-05-26 09:20:31 -07:00
mm_init.c
mmap.c	mmap: fix and tidy up overcommit page arithmetic	2011-07-25 20:57:09 -07:00
mmu_context.c	exit: fix oops in sync_mm_rss	2010-03-24 16:31:21 -07:00
mmu_notifier.c	thp: mmu_notifier_test_young	2011-01-13 17:32:46 -08:00
mmzone.c	mm: page allocator: adjust the per-cpu counter threshold when memory is low	2011-01-13 17:32:31 -08:00
mprotect.c	thp: mprotect: transparent huge page support	2011-01-13 17:32:44 -08:00
mremap.c	mm: Convert i_mmap_lock to a mutex	2011-05-25 08:39:18 -07:00
msync.c	sanitize vfs_fsync calling conventions	2010-05-21 18:31:21 -04:00
nobootmem.c	memblock/nobootmem: remove unneeded code from alloc_bootmem_node_high()	2011-05-25 08:39:31 -07:00
nommu.c	mmap: fix and tidy up overcommit page arithmetic	2011-07-25 20:57:09 -07:00
oom_kill.c	oom: task->mm == NULL doesn't mean the memory was freed	2011-08-01 15:24:12 -10:00
page-writeback.c	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/writeback	2011-07-26 10:39:54 -07:00
page_alloc.c	fault-injection: add ability to export fault_attr in arbitrary directory	2011-08-03 14:25:20 -10:00
page_cgroup.c	mm/page_cgroup.c: simplify code by using SECTION_ALIGN_UP() and SECTION_ALIGN_DOWN() macros	2011-07-25 20:57:09 -07:00
page_io.c	block: kill off REQ_UNPLUG	2011-03-10 08:52:27 +01:00
page_isolation.c	mm: page_isolation: codeclean fix comment and rm unneeded val init	2010-10-26 16:52:11 -07:00
pagewalk.c	pagewalk: fix code comment for THP	2011-07-25 20:57:09 -07:00
percpu-km.c	percpu: clear memory allocated with the km allocator	2010-10-02 10:28:42 +03:00
percpu-vm.c	mm: remove gfp mask from pcpu_get_vm_areas	2011-01-13 17:32:34 -08:00
percpu.c	Merge branch 'for-2.6.40' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu	2011-05-24 11:53:42 -07:00
pgtable-generic.c	mm/pgtable-generic.c: fix CONFIG_SWAP=n build	2011-01-26 10:49:58 +10:00
prio_tree.c	sanitize <linux/prefetch.h> usage	2011-05-20 12:50:29 -07:00
quicklist.c	include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h	2010-03-30 22:02:32 +09:00
readahead.c	readahead: readahead page allocations are OK to fail	2011-05-25 08:39:25 -07:00
rmap.c	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/writeback	2011-07-26 10:39:54 -07:00
shmem.c	mm: clarify the radix_tree exceptional cases	2011-08-03 14:25:24 -10:00
slab.c	slab, lockdep: Annotate the locks before using them	2011-08-04 10:18:00 +02:00
slob.c	atomic: use <linux/atomic.h>	2011-07-26 16:49:47 -07:00
slub.c	slub: Fix partial count comparison confusion	2011-08-09 21:12:31 +03:00
sparse-vmemmap.c	tree-wide: fix comment/printk typos	2010-11-01 15:38:34 -04:00
sparse.c	mm: make some struct page's const	2011-07-25 20:57:07 -07:00
swap.c	mm: batch activate_page() to reduce lock contention	2011-05-25 08:39:37 -07:00
swap_state.c	block: remove per-queue plugging	2011-03-10 08:52:07 +01:00
swapfile.c	mm: let swap use exceptional entries	2011-08-03 14:25:22 -10:00
thrash.c	mm: swap-token: add a comment for priority aging	2011-07-25 20:57:08 -07:00
truncate.c	mm: a few small updates for radix-swap	2011-08-03 14:25:24 -10:00
util.c	mm: nommu: sort mm->mmap list properly	2011-05-25 08:39:05 -07:00
vmalloc.c	atomic: use <linux/atomic.h>	2011-07-26 16:49:47 -07:00
vmscan.c	memcg: add memory.vmscan_stat	2011-07-26 16:49:42 -07:00
vmstat.c	mm, mem-hotplug: update pcp->stat_threshold when memory hotplug occur	2011-05-25 08:39:09 -07:00