linux/fs
Chris Snook 4e6fd33b75 [PATCH] enforce RLIMIT_NOFILE in poll()
POSIX states that poll() shall fail with EINVAL if nfds > OPEN_MAX.  In
this context, POSIX is referring to sysconf(OPEN_MAX), which is the value
of current->signal->rlim[RLIMIT_NOFILE].rlim_cur in the linux kernel, not
the compile-time constant which happens to also be named OPEN_MAX.  In the
current code, an application may poll up to max_fdset file descriptors,
even if this exceeds RLIMIT_NOFILE.  The current code also breaks
applications which poll more than max_fdset descriptors, which worked circa
2.4.18 when the check was against NR_OPEN, which is 1024*1024.  This patch
enforces the limit precisely as POSIX defines, even if RLIMIT_NOFILE has
been changed at run time with ulimit -n.

To elaborate on the rationale for this, there are three cases:

1) RLIMIT_NOFILE is at the default value of 1024

In this (default) case, the patch changes nothing.  Calls with nfds > 1024
fail with EINVAL both before and after the patch, and calls with nfds <=
1024 pass the check both before and after the patch, since 1024 is the
initial value of max_fdset.

2) RLIMIT_NOFILE has been raised above the default

In this case, poll() becomes more permissive, allowing polling up to
RLIMIT_NOFILE file descriptors even if less than 1024 have been opened.
The patch won't introduce new errors here.  If an application somehow
depends on poll() failing when it polls with duplicate or invalid file
descriptors, it's already broken, since this is already allowed below 1024,
and will also work above 1024 if enough file descriptors have been open at
some point to cause max_fdset to have been increased above nfds.

3) RLIMIT_NOFILE has been lowered below the default

In this case, the system administrator or the user has gone out of their
way to protect the system from inefficient (or malicious) applications
wasting kernel memory.  The current code allows polling up to 1024 file
descriptors even if RLIMIT_NOFILE is much lower, which is not what the user
or administrator intended.  Well-written applications which only poll
valid, unique file descriptors will never notice the difference, because
they'll hit the limit on open() first.  If an application gets broken
because of the patch in this case, then it was already poorly/maliciously
designed, and allowing it to work in the past was a violation of POSIX and
a DoS risk on low-resource systems.

With this patch, poll() will permit exactly what POSIX suggests, no more,
no less, and for any run-time value set with ulimit -n, not just 256 or
1024.  There are existing apps which which poll a large number of file
descriptors, some of which may be invalid, and if those numbers stradle
1024, they currently fail with or without the patch in -mm, though they
worked fine under 2.4.18.

Signed-off-by: Chris Snook <csnook@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 09:18:23 -07:00
..
9p [PATCH] 9p: fix leak on error path 2006-09-29 09:18:20 -07:00
adfs [PATCH] inode-diet: Eliminate i_blksize from the inode structure 2006-09-27 08:26:18 -07:00
affs [PATCH] Really ignore kmem_cache_destroy return value 2006-09-27 08:26:10 -07:00
afs [PATCH] afs: add lock annotations to afs_proc_cell_servers_{start,stop} 2006-09-29 09:18:07 -07:00
autofs [PATCH] inode-diet: Eliminate i_blksize from the inode structure 2006-09-27 08:26:18 -07:00
autofs4 [PATCH] autofs4: pending flag not cleared on mount fail 2006-09-29 09:18:18 -07:00
befs [PATCH] inode-diet: Eliminate i_blksize from the inode structure 2006-09-27 08:26:18 -07:00
bfs [PATCH] inode-diet: Eliminate i_blksize from the inode structure 2006-09-27 08:26:18 -07:00
cifs [PATCH] inode-diet: Eliminate i_blksize from the inode structure 2006-09-27 08:26:18 -07:00
coda [PATCH] inode-diet: Eliminate i_blksize from the inode structure 2006-09-27 08:26:18 -07:00
configfs [PATCH] inode-diet: Eliminate i_blksize from the inode structure 2006-09-27 08:26:18 -07:00
cramfs [PATCH] cramfs: make cramfs_uncompress_exit() return void 2006-09-29 09:18:20 -07:00
debugfs [PATCH] inode-diet: Eliminate i_blksize from the inode structure 2006-09-27 08:26:18 -07:00
devpts [PATCH] inode-diet: Eliminate i_blksize from the inode structure 2006-09-27 08:26:18 -07:00
efs [PATCH] Really ignore kmem_cache_destroy return value 2006-09-27 08:26:10 -07:00
exportfs [PATCH] NFS server subtree_check returns dubious value 2006-05-21 12:59:16 -07:00
ext2 [PATCH] inode-diet: Eliminate i_blksize from the inode structure 2006-09-27 08:26:18 -07:00
ext3 [PATCH] inode-diet: Eliminate i_blksize from the inode structure 2006-09-27 08:26:18 -07:00
fat [PATCH] add -o flush for fat 2006-09-29 09:18:12 -07:00
freevxfs [PATCH] freevxfs: fix leak on error path 2006-09-29 09:18:20 -07:00
fuse [PATCH] vfs: define new lookup flag for chdir 2006-09-29 09:18:08 -07:00
hfs [PATCH] inode-diet: Eliminate i_blksize from the inode structure 2006-09-27 08:26:18 -07:00
hfsplus [PATCH] inode-diet: Eliminate i_blksize from the inode structure 2006-09-27 08:26:18 -07:00
hostfs [PATCH] inode-diet: Eliminate i_blksize from the inode structure 2006-09-27 08:26:18 -07:00
hpfs [PATCH] inode-diet: Eliminate i_blksize from the inode structure 2006-09-27 08:26:18 -07:00
hppfs [PATCH] inode-diet: Eliminate i_blksize from the inode structure 2006-09-27 08:26:18 -07:00
hugetlbfs [PATCH] hugetlbfs: add lock annotation to hugetlbfs_forget_inode() 2006-09-29 09:18:08 -07:00
isofs [PATCH] I/O Error attempting to read last partial block of a file in an ISO9660 file system 2006-09-29 09:18:15 -07:00
jbd [PATCH] JBD: memory leak in "journal_init_dev()" 2006-09-29 09:18:03 -07:00
jffs [PATCH] inode-diet: Eliminate i_blksize from the inode structure 2006-09-27 08:26:18 -07:00
jffs2 [PATCH] inode-diet: Eliminate i_blksize from the inode structure 2006-09-27 08:26:18 -07:00
jfs [PATCH] inode-diet: Eliminate i_blksize from the inode structure 2006-09-27 08:26:18 -07:00
lockd [PATCH] add newline to nfs dprintk 2006-09-27 08:26:19 -07:00
minix [PATCH] inode-diet: Eliminate i_blksize from the inode structure 2006-09-27 08:26:18 -07:00
msdos [PATCH] add -o flush for fat 2006-09-29 09:18:12 -07:00
ncpfs [PATCH] inode-diet: Eliminate i_blksize from the inode structure 2006-09-27 08:26:18 -07:00
nfs [PATCH] fs/nfs/: make code static 2006-09-27 08:26:20 -07:00
nfs_common [PATCH] nfsacl: Solaris VxFS compatibility fix 2005-10-11 09:46:54 -07:00
nfsd [PATCH] Really ignore kmem_cache_destroy return value 2006-09-27 08:26:10 -07:00
nls Remove obsolete #include <linux/config.h> 2006-06-30 19:25:36 +02:00
ntfs [PATCH] inode-diet: Eliminate i_blksize from the inode structure 2006-09-27 08:26:18 -07:00
ocfs2 [PATCH] inode-diet: Eliminate i_blksize from the inode structure 2006-09-27 08:26:18 -07:00
openpromfs Move several *_SUPER_MAGIC symbols to include/linux/magic.h. 2006-09-24 11:13:19 -04:00
partitions [PATCH] ignore partition table on disks with AIX label 2006-09-29 09:18:09 -07:00
proc [PATCH] fix mem_write() return value 2006-09-29 09:18:19 -07:00
qnx4 [PATCH] inode-diet: Eliminate i_blksize from the inode structure 2006-09-27 08:26:18 -07:00
ramfs [PATCH] inode-diet: Eliminate i_blksize from the inode structure 2006-09-27 08:26:18 -07:00
reiserfs [PATCH] reiserfs: ifdef ACL stuff from inode 2006-09-29 09:18:11 -07:00
romfs [PATCH] Really ignore kmem_cache_destroy return value 2006-09-27 08:26:10 -07:00
smbfs [PATCH] inode-diet: Eliminate i_blksize from the inode structure 2006-09-27 08:26:18 -07:00
sysfs [PATCH] inode-diet: Eliminate i_blksize from the inode structure 2006-09-27 08:26:18 -07:00
sysv [PATCH] inode-diet: Eliminate i_blksize from the inode structure 2006-09-27 08:26:18 -07:00
udf [PATCH] mount udf UDF_PART_FLAG_READ_ONLY partitions with MS_RDONLY 2006-09-29 09:18:09 -07:00
ufs [PATCH] inode-diet: Eliminate i_blksize from the inode structure 2006-09-27 08:26:18 -07:00
vfat [PATCH] VFS: Permit filesystem to override root dentry on mount 2006-06-23 07:42:45 -07:00
xfs [PATCH] inode-diet: Eliminate i_blksize from the inode structure 2006-09-27 08:26:18 -07:00
Kconfig [PATCH] sysctl: Allow /proc/sys without sys_sysctl 2006-09-27 08:26:19 -07:00
Kconfig.binfmt [PATCH] frv: suppress configuration of certain features for FRV 2006-01-08 20:13:36 -08:00
Makefile [PATCH] devfs: Remove devfs from the kernel tree 2006-06-26 12:25:05 -07:00
aio.c spelling fixes 2006-06-26 18:35:02 +02:00
attr.c [PATCH] capable/capability.h (fs/) 2006-01-11 18:42:13 -08:00
bad_inode.c [PATCH] Make most file operations structs in fs/ const 2006-03-28 09:16:06 -08:00
binfmt_aout.c [PATCH] Require mmap handler for a.out executables 2006-09-29 09:18:08 -07:00
binfmt_elf.c [PATCH] elf_core_dump: don't take tasklist_lock 2006-09-29 09:18:14 -07:00
binfmt_elf_fdpic.c [PATCH] elf_fdpic_core_dump: don't take tasklist_lock 2006-09-29 09:18:14 -07:00
binfmt_em86.c
binfmt_flat.c Remove obsolete #include <linux/config.h> 2006-06-30 19:25:36 +02:00
binfmt_misc.c [PATCH] Fix unserialized task->files changing 2006-09-29 09:18:12 -07:00
binfmt_script.c
binfmt_som.c Remove obsolete #include <linux/config.h> 2006-06-30 19:25:36 +02:00
bio.c [PATCH] Fix missing ret assignment in __bio_map_user() error path 2006-06-17 10:52:12 -07:00
block_dev.c [PATCH] block_dev.c mutex_lock_nested() fix 2006-09-29 09:18:19 -07:00
buffer.c [PATCH] mm: tracking shared dirty pages 2006-09-26 08:48:44 -07:00
char_dev.c [PATCH] cdev documentation 2006-09-29 09:18:16 -07:00
compat.c [PATCH] Check return value of copy_to_user in compat_sys_pselect7 2006-09-26 10:52:39 +02:00
compat_ioctl.c Remove obsolete #include <linux/config.h> 2006-06-30 19:25:36 +02:00
dcache.c NFS: Add dentry materialisation op 2006-09-22 23:24:30 -04:00
dcookies.c Remove obsolete #include <linux/config.h> 2006-06-30 19:25:36 +02:00
direct-io.c [PATCH] lockdep: annotate direct io 2006-07-03 15:27:06 -07:00
dnotify.c [PATCH] Use __read_mostly on some hot fs variables 2006-03-26 08:56:56 -08:00
dquot.c [PATCH] dquot: add proper locking when using current->signal->tty 2006-09-29 09:18:14 -07:00
drop_caches.c [PATCH] drop-pagecache 2006-01-08 20:12:40 -08:00
eventpoll.c [PATCH] inode-diet: Eliminate i_blksize from the inode structure 2006-09-27 08:26:18 -07:00
exec.c [PATCH] Fix unserialized task->files changing 2006-09-29 09:18:12 -07:00
fcntl.c BUG_ON() Conversion in fs/fcntl.c 2006-04-02 13:37:19 +02:00
fifo.c [PATCH] pipe.c/fifo.c code cleanups 2006-04-11 13:53:33 +02:00
file.c [PATCH] alloc_fdtable() cleanup 2006-09-27 08:26:19 -07:00
file_table.c [PATCH] inode-diet: Move i_cdev into a union 2006-09-27 08:26:17 -07:00
filesystems.c [PATCH] Ban register_filesystem(NULL); 2006-09-29 09:18:20 -07:00
fs-writeback.c [PATCH] zoned vm counters: conversion of nr_unstable to per zone counter 2006-06-30 11:25:36 -07:00
inode.c [PATCH] fs.h: ifdef security fields 2006-09-29 09:18:11 -07:00
inotify.c [PATCH] inotify (4/5): allow watch removal from event handler 2006-06-20 05:25:19 -04:00
inotify_user.c [PATCH] inotify: fix deadlock found by lockdep 2006-07-31 13:28:41 -07:00
ioctl.c Remove obsolete #include <linux/config.h> 2006-06-30 19:25:36 +02:00
ioprio.c [PATCH] uninline ioprio_best() 2006-08-21 10:02:50 +02:00
libfs.c [PATCH] libfs: remove page up-to-date check from simple_readpage 2006-09-29 09:18:06 -07:00
locks.c [PATCH] fcntl(F_SETSIG) fix 2006-08-14 13:10:59 -07:00
mbcache.c [PATCH] mbcache: add lock annotation for __mb_cache_entry_release_unlock() 2006-09-29 09:18:07 -07:00
mpage.c [PATCH] writeback: fix range handling 2006-06-23 07:42:49 -07:00
namei.c [PATCH] fs/namei.c: replace multiple current->fs by shortcut variable 2006-09-29 09:18:22 -07:00
namespace.c [PATCH] fs/namespace: handle init/registration errors 2006-09-29 09:18:05 -07:00
nfsctl.c Remove obsolete #include <linux/config.h> 2006-06-30 19:25:36 +02:00
open.c [PATCH] fix wrong error code on interrupted close syscalls 2006-09-29 09:18:13 -07:00
pipe.c [PATCH] inode-diet: Eliminate i_blksize from the inode structure 2006-09-27 08:26:18 -07:00
pnode.c [PATCH] core: use list_move() 2006-06-26 09:58:17 -07:00
pnode.h
posix_acl.c
quota.c [PATCH] sem2mutex: quota 2006-03-23 07:38:11 -08:00
quota_v1.c
quota_v2.c [PATCH] sem2mutex: quota 2006-03-23 07:38:11 -08:00
read_write.c [PATCH] fs/read_write.c: EXPORT_UNUSED_SYMBOL 2006-07-10 13:24:18 -07:00
readdir.c [PATCH] mutex subsystem, semaphore to mutex: VFS, ->i_sem 2006-01-09 15:59:24 -08:00
select.c [PATCH] enforce RLIMIT_NOFILE in poll() 2006-09-29 09:18:23 -07:00
seq_file.c [PATCH] sem2mutex: fs/seq_file.c 2006-03-23 07:38:12 -08:00
splice.c [PATCH] splice: fix problems with sys_tee() 2006-07-10 11:00:01 +02:00
stat.c [PATCH] inode-diet: Eliminate i_blksize from the inode structure 2006-09-27 08:26:18 -07:00
super.c [PATCH] fs: add lock annotation to grab_super 2006-09-29 09:18:08 -07:00
sync.c [PATCH] writeback: fix range handling 2006-06-23 07:42:49 -07:00
xattr.c [PATCH] log more info for directory entry change events 2006-06-20 05:25:28 -04:00
xattr_acl.c