linux/mm
Hugh Dickins 9b15b817f3 swap: fix shmem swapping when more than 8 areas
Minchan Kim reports that when a system has many swap areas, and tmpfs
swaps out to the ninth or more, shmem_getpage_gfp()'s attempts to read
back the page cannot locate it, and the read fails with -ENOMEM.

Whoops.  Yes, I blindly followed read_swap_header()'s pte_to_swp_entry(
swp_entry_to_pte()) technique for determining maximum usable swap
offset, without stopping to realize that that actually depends upon the
pte swap encoding shifting swap offset to the higher bits and truncating
it there.  Whereas our radix_tree swap encoding leaves offset in the
lower bits: it's swap "type" (that is, index of swap area) that was
truncated.

Fix it by reducing the SWP_TYPE_SHIFT() in swapops.h, and removing the
broken radix_to_swp_entry(swp_to_radix_entry()) from read_swap_header().

This does not reduce the usable size of a swap area any further, it
leaves it as claimed when making the original commit: no change from 3.0
on x86_64, nor on i386 without PAE; but 3.0's 512GB is reduced to 128GB
per swapfile on i386 with PAE.  It's not a change I would have risked
five years ago, but with x86_64 supported for ten years, I believe it's
appropriate now.

Hmm, and what if some architecture implements its swap pte with offset
encoded below type? That would equally break the maximum usable swap
offset check.  Happily, they all follow the same tradition of encoding
offset above type, but I'll prepare a check on that for next.

Reported-and-Reviewed-and-Tested-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: stable@vger.kernel.org [3.1, 3.2, 3.3, 3.4]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-15 21:48:14 -07:00
..
Kconfig Frontswap provides a "transcendent memory" interface for swap pages. 2012-06-04 12:28:45 -07:00
Kconfig.debug
Makefile Frontswap provides a "transcendent memory" interface for swap pages. 2012-06-04 12:28:45 -07:00
backing-dev.c
bootmem.c mm/bootmem.c: cleanup on addition to bootmem data list 2012-05-29 16:22:24 -07:00
bounce.c
cleancache.c ->encode_fh() API change 2012-05-29 23:28:33 -04:00
compaction.c Revert "mm: compaction: handle incorrect MIGRATE_UNMOVABLE type pageblocks" 2012-06-03 20:05:57 -07:00
debug-pagealloc.c
dmapool.c
fadvise.c
failslab.c
filemap.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-06-01 10:34:35 -07:00
filemap_xip.c fs: introduce inode operation ->update_time 2012-06-01 12:07:25 -04:00
fremap.c
frontswap.c
highmem.c
huge_memory.c mm/memcg: apply add/del_page to lruvec 2012-05-29 16:22:28 -07:00
hugetlb.c mm: fix vma_resv_map() NULL pointer 2012-05-30 08:48:13 -07:00
hwpoison-inject.c
init-mm.c
internal.h Revert "mm: compaction: handle incorrect MIGRATE_UNMOVABLE type pageblocks" 2012-06-03 20:05:57 -07:00
kmemcheck.c
kmemleak-test.c
kmemleak.c
ksm.c
maccess.c
madvise.c mm/fs: route MADV_REMOVE to FALLOC_FL_PUNCH_HOLE 2012-05-29 16:22:22 -07:00
memblock.c memblock: Document memblock_is_region_{memory,reserved}() 2012-06-08 11:59:46 +02:00
memcontrol.c memcg: decrement static keys at real destroy time 2012-05-29 16:22:28 -07:00
memory-failure.c mm/memory_failure: let the compiler add the function name 2012-05-29 16:22:18 -07:00
memory.c thp, memcg: split hugepage for memcg oom on cow 2012-05-29 16:22:19 -07:00
memory_hotplug.c mm: print physical addresses consistently with other parts of kernel 2012-05-29 16:22:21 -07:00
mempolicy.c mm: do_migrate_pages(): rename arguments 2012-05-29 16:22:20 -07:00
mempool.c
migrate.c mm: fix warning in __set_page_dirty_nobuffers 2012-06-03 20:05:47 -07:00
mincore.c
mlock.c
mm_init.c
mmap.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-06-01 10:34:35 -07:00
mmu_context.c
mmu_notifier.c
mmzone.c mm: add link from struct lruvec to struct zone 2012-05-29 16:22:26 -07:00
mprotect.c
mremap.c move security_mmap_addr() to saner place 2012-06-01 10:37:16 -04:00
msync.c
nobootmem.c mm: remove sparsemem allocation details from the bootmem allocator 2012-05-29 16:22:22 -07:00
nommu.c nommu: fix compilation of nommu.c 2012-06-04 17:17:31 -04:00
oom_kill.c mm, oom: fix badness score underflow 2012-06-08 15:07:35 -07:00
page-writeback.c
page_alloc.c Revert "mm: compaction: handle incorrect MIGRATE_UNMOVABLE type pageblocks" 2012-06-03 20:05:57 -07:00
page_cgroup.c
page_io.c
page_isolation.c mm: page_isolation: MIGRATE_CMA isolation functions added 2012-05-21 15:09:33 +02:00
pagewalk.c
percpu-km.c
percpu-vm.c
percpu.c
pgtable-generic.c arch/tile: allow building Linux with transparent huge pages enabled 2012-05-25 12:48:21 -04:00
prio_tree.c
process_vm_access.c aio/vfs: cleanup of rw_copy_check_uvector() and compat_rw_copy_check_uvector() 2012-05-31 17:49:32 -07:00
quicklist.c
readahead.c mm: move readahead syscall to mm/readahead.c 2012-05-29 16:22:23 -07:00
rmap.c mm: remove swap token code 2012-05-29 16:22:19 -07:00
shmem.c shmem: replace_page must flush_dcache and others 2012-06-07 14:43:54 -07:00
slab.c
slob.c
slub.c Merge branch 'slab/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux 2012-06-01 16:50:23 -07:00
sparse-vmemmap.c
sparse.c mm: remove sparsemem allocation details from the bootmem allocator 2012-05-29 16:22:22 -07:00
swap.c mm/memcg: apply add/del_page to lruvec 2012-05-29 16:22:28 -07:00
swap_state.c
swapfile.c swap: fix shmem swapping when more than 8 areas 2012-06-15 21:48:14 -07:00
truncate.c mm/fs: remove truncate_range 2012-05-29 16:22:23 -07:00
util.c new helper: vm_mmap_pgoff() 2012-06-01 10:37:18 -04:00
vmalloc.c mm: fix faulty initialization in vmalloc_init() 2012-05-29 16:22:24 -07:00
vmscan.c mm/memcg: apply add/del_page to lruvec 2012-05-29 16:22:28 -07:00
vmstat.c mm/vmstat.c: remove debug fs entries on failure of file creation and made extfrag_debug_root dentry local 2012-05-29 16:22:19 -07:00