linux

Commit Graph

Author	SHA1	Message	Date
Alexey Dobriyan	1c6ace019b	fs/Kconfig: move fat out Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>	2009-01-22 13:15:55 +03:00
Alexey Dobriyan	ddfaccd995	fs/Kconfig: move iso9660, udf out Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>	2009-01-22 13:15:55 +03:00
Alexey Dobriyan	3ef7784e47	fs/Kconfig: move fuse out Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>	2009-01-22 13:15:55 +03:00
Alexey Dobriyan	90ffd46793	fs/Kconfig: move autofs, autofs4 out Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>	2009-01-22 13:15:54 +03:00
Alexey Dobriyan	335debee07	fs/Kconfig: move btrfs out Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>	2009-01-22 13:15:54 +03:00
Alexey Dobriyan	2fe4371dff	fs/Kconfig: move ocfs2 out Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>	2009-01-22 13:15:54 +03:00
Alexey Dobriyan	f5c77969b3	fs/Kconfig: move jfs out Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>	2009-01-22 13:15:54 +03:00
Alexey Dobriyan	b16ecfe2f9	fs/Kconfig: move reiserfs out Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>	2009-01-22 13:15:53 +03:00
Dave Chinner	74e2d06521	Long btree pointers are still 64 bit on disk [XFS] Long btree pointers are still 64 bit on disk On 32 bit machines with CONFIG_LBD=n, XFS reduces the in memory size of xfs_fsblock_t to 32 bits so that it will fit within 32 bit addressing. However, the disk format for long btree pointers are still 64 bits in size. The recent btree rewrite failed to take this into account when initialising new btree blocks, setting sibling pointers to NULL and checking if they are NULL. Hence checking whether a 64 bit NULL was the same as a 32 bit NULL was failingi resulting in NULL sibling pointers failing to be detected correctly. This showed up as WANT_CORRUPTED_GOTO shutdowns in xfs_btree_delrec. Fix this by making all the comparisons and setting of long pointer btree NULL blocks to the disk format, not the in memory format. i.e. use NULLDFSBNO. Reported-by: Alexander Beregalov <a.beregalov@gmail.com> Reported-by: Jacek Luczak <difrost.kernel@gmail.com> Reported-by: Danny ter Haar <dth@dth.net> Tested-by: Jacek Luczak <difrost.kernel@gmail.com> Reviewed-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>	2009-01-22 01:23:11 -06:00
Jeff Layton	20d5a39929	dlm: initialize file_lock struct in GETLK before copying conflicting lock dlm_posix_get fills out the relevant fields in the file_lock before returning when there is a lock conflict, but doesn't clean out any of the other fields in the file_lock. When nfsd does a NFSv4 lockt call, it sets the fl_lmops to nfsd_posix_mng_ops before calling the lower fs. When the lock comes back after testing a lock on GFS2, it still has that field set. This confuses nfsd into thinking that the file_lock is a nfsd4 lock. Fix this by making DLM reinitialize the file_lock before copying the fields from the conflicting lock. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2009-01-21 15:28:45 -06:00
David Teigland	24179f4880	dlm: fix plock notify callback to lockd We should use the original copy of the file_lock, fl, instead of the copy, flc in the lockd notify callback. The range in flc has been modified by posix_lock_file(), so it will not match a copy of the lock in lockd. Signed-off-by: David Teigland <teigland@redhat.com>	2009-01-21 15:28:45 -06:00
Yehuda Sadeh	1506fcc818	Btrfs: fiemap support Now that bmap support is gone, this is the only way to get extent mappings for userland. These are still not valid for IO, but they can tell us if a file has holes or how much fragmentation there is. Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>	2009-01-21 14:39:14 -05:00
Chris Mason	35054394c4	Btrfs: stop providing a bmap operation to avoid swapfile corruptions Swapfiles use bmap to build a list of extents belonging to the file, and they assume these extents won't change over the life of the file. They also use resulting list to do IO directly to the block device. This causes problems for btrfs in a few ways: btrfs returns logical block numbers through bmap, and these are not suitable for IO. They might translate to different devices, raid etc. COW means that file block mappings are going to change frequently. Using swapfiles on btrfs will lead to corruption, so we're avoiding the problem for now by dropping bmap support entirely. A later commit will add fiemap support for people that really want to know how a file is laid out. Signed-off-by: Chris Mason <chris.mason@oracle.com>	2009-01-21 13:11:13 -05:00
Yan Zheng	7237f18336	Btrfs: fix tree logs parallel sync To improve performance, btrfs_sync_log merges tree log sync requests. But it wrongly merges sync requests for different tree logs. If multiple tree logs are synced at the same time, only one of them actually gets synced. This patch has following changes to fix the bug: Move most tree log related fields in btrfs_fs_info to btrfs_root. This allows merging sync requests separately for each tree log. Don't insert root item into the log root tree immediately after log tree is allocated. Root item for log tree is inserted when log tree get synced for the first time. This allows syncing the log root tree without first syncing all log trees. At tree-log sync, btrfs_sync_log first sync the log tree; then updates corresponding root item in the log root tree; sync the log root tree; then update the super block. Signed-off-by: Yan Zheng <zheng.yan@oracle.com>	2009-01-21 12:54:03 -05:00
Qinghuang Feng	7e6628544a	Btrfs: open_ctree() error handling can oops on fs_info a bug in open_ctree: struct btrfs_root *open_ctree(..) { .... if (!extent_root \|\| !tree_root \|\| !fs_info \|\| !chunk_root \|\| !dev_root \|\| !csum_root) { err = -ENOMEM; goto fail; //When code flow goes to "fail", fs_info may be NULL or uninitialized. } .... fail: btrfs_close_devices(fs_info->fs_devices);// ! btrfs_mapping_tree_free(&fs_info->mapping_tree);// ! kfree(extent_root); kfree(tree_root); bdi_destroy(&fs_info->bdi);// ! ... ) Signed-off-by: Qinghuang Feng <qhfeng.kernel@gmail.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>	2009-01-21 10:49:16 -05:00
Yan Zheng	86288a198d	Btrfs: fix stop searching test in replace_one_extent replace_one_extent searches tree leaves for references to a given extent. It stops searching if it goes beyond the last possible position. The last possible position is computed by adding the starting offset of a found file extent to the full size of the extent. The code uses physical size of the extent as the full size. This is incorrect when compression is used. The fix is get the full size from ram_bytes field of file extent item. Signed-off-by: Yan Zheng <zheng.yan@oracle.com>	2009-01-21 10:49:16 -05:00
Jan Engelhardt	95029d7d59	Btrfs: change/remove typedef Change one typedef to a regular enum, and remove an unused one. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Chris Mason <chris.mason@oracle.com>	2009-01-21 10:49:16 -05:00
Huang Weiyi	653249ff9a	Btrfs: remove duplicated #include Removed duplicated #include "compat.h"in fs/btrfs/extent-tree.c Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>	2009-01-21 10:49:16 -05:00
Yan Zheng	5a7be515b1	Btrfs: Fix infinite loop in btrfs_extent_post_op btrfs_extent_post_op calls finish_current_insert and del_pending_extents. They both may enter infinite loops. finish_current_insert enters infinite loop if it only finds some backrefs to update. The fix is to check for pending backref updates before restarting the loop. The infinite loop in del_pending_extents is due to a the skipped variable not being properly reset before looping around. Signed-off-by: Yan Zheng <zheng.yan@oracle.com>	2009-01-21 10:49:16 -05:00
Yan Zheng	3dfdb9348a	Btrfs: fix locking issue in btrfs_remove_block_group We should hold the block_group_cache_lock while modifying the block groups red-black tree. Thank you, Signed-off-by: Yan Zheng <zheng.yan@oracle.com>	2009-01-21 10:49:16 -05:00
Qinghuang Feng	c6e308713a	Btrfs: simplify iteration codes Merge list_for_each* and list_entry to list_for_each_entry* Signed-off-by: Qinghuang Feng <qhfeng.kernel@gmail.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>	2009-01-21 10:59:08 -05:00
Qinghuang Feng	57506d50ed	Btrfs: check return value for kthread_run() correctly kthread_run() returns the kthread or ERR_PTR(-ENOMEM), not NULL. Signed-off-by: Qinghuang Feng <qhfeng.kernel@gmail.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>	2009-01-21 10:49:16 -05:00
Roland Dreier	119e10cf1b	Btrfs: Remove extra KERN_INFO in the middle of a line The "devid <xxx> transid <xxx>" printk in btrfs_scan_one_device() actually follows another printk that doesn't end in a newline (since the intention is for the two printks to make one line of output), so the KERN_INFO just ends up messing up the output: device label exp <6>devid 1 transid 9 /dev/sda5 Fix this by changing the extra KERN_INFO to KERN_CONT. Signed-off-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>	2009-01-21 10:49:16 -05:00
Huang Weiyi	7eaebe7d50	Btrfs: removed unused #include <version.h>'s Removed unused #include <version.h>'s in btrfs Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>	2009-01-21 10:49:16 -05:00
Josef Bacik	070604040b	Btrfs: cleanup xattr code Andrew's review of the xattr code revealed some minor issues that this patch addresses. Just an error return fix, got rid of a useless statement and commented one of the trickier parts of __btrfs_getxattr. Signed-off-by: Josef Bacik <jbacik@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>	2009-01-21 10:49:16 -05:00
Wang Cong	19d00cc196	Btrfs: cleanup fs/btrfs/super.c::btrfs_control_ioctl() - Remove the unused local variable 'len'; - Check return value of kmalloc(). Signed-off-by: Wang Cong <wangcong@zeuux.org> Signed-off-by: Chris Mason <chris.mason@oracle.com>	2009-01-21 10:49:16 -05:00
Jan Kara	c475146d8f	ocfs2: Remove ocfs2_dquot_initialize() and ocfs2_dquot_drop() Since ->acquire_dquot and ->release_dquot callbacks aren't called under dqptr_sem anymore, we don't have to start a transaction and obtain locks so early. So we can just remove all this complicated stuff. Signed-off-by: Jan Kara <jack@suse.cz> Acked-by: Mark Fasheh <mfasheh@suse.de>	2009-01-21 15:25:57 +01:00
Greg Kroah-Hartman	4503efd089	sysfs: fix problems with binary files Some sysfs binary files don't like having 0 passed to them as a size. Fix this up at the root by just returning to the vfs if userspace asks us for a zero sized buffer. Thanks to Pavel Roskin for pointing this out. Reported-by: Pavel Roskin <proski@gnu.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-20 20:52:09 -08:00
Eric Sandeen	b6e3222732	[XFS] Remove the rest of the macro-to-function indirections. Remove the last of the macros-defined-to-static-functions. Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Reviewed-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2009-01-19 14:45:55 +11:00
Christoph Hellwig	b828d8c338	xfs: sanity check attr fork size Recently we have quite a few kerneloops reports about dereferencing a NULL if_data in the attribute fork. From looking over the code this can only happen if we pass a 0 size argument to xfs_iformat_local. This implies some sort of corruption and in fact the only mailinglist report about this from earlier this year was after a powerfail presumably on a system with write cache and without barriers. Add a quick sanity check for the attr fork size in xfs_iformat to catch these early and without an oops. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com>	2009-01-19 14:45:11 +11:00
Christoph Hellwig	49739140e5	xfs: fix bad_features2 fixups for the root filesystem Currently the bad_features2 fixup and the alignment updates in the superblock are skipped if we mount a filesystem read-only. But for the root filesystem the typical case is to mount read-only first and only later remount writeable so we'll never perform this update at all. It's not a big problem but means the logs of people needing the fixup get spammed at every boot because they never happen on disk. Reported-by: Arkadiusz Miskiewicz <arekm@maven.pl> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com>	2009-01-19 14:45:04 +11:00
Christoph Hellwig	5aa2dc0a06	xfs: add a lock class for group/project dquots We can have both a user and a group/project dquot locked at the same time, as long as the user dquot is locked first. Tell lockdep about that fact by making the group/project dquots a different lock class. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com>	2009-01-19 14:44:59 +11:00
Christoph Hellwig	4f2d4ac6e5	xfs: lockdep annotations for xfs_dqlock2 xfs_dqlock2 locks two xfs_dquots, which is fine as it always locks the dquot with the lower id first. Use mutex_lock_nested to tell lockdep about this fact. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com>	2009-01-19 14:44:52 +11:00
Christoph Hellwig	080dda7f5e	xfs: add a separate lock class for the per-mount list of dquots We can have both a a quota hash chain and the per-mount list locked at the same time. But given that both use the same struct dqhash as list head we have to tell lockdep that they are different lock classes. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com>	2009-01-19 14:44:44 +11:00
Christoph Hellwig	62e194ecda	xfs: use mnt_want_write in compat_attrmulti ioctl The compat version of the attrmulti ioctl needs to ask for and then later release write access to the mount just like the native version, otherwise we could potentially write to read-only mounts. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com>	2009-01-19 14:44:30 +11:00
Christoph Hellwig	ab596ad897	xfs: fix dentry aliasing issues in open_by_handle Open by handle just grabs an inode by handle and then creates itself a dentry for it. While this works for regular files it is horribly broken for directories, where the VFS locking relies on the fact that there is only just one single dentry for a given inode, and that these are always connected to the root of the filesystem so that it's locking algorithms work (see Documentations/filesystems/Locking) Remove all the existing open by handle code and replace it with a small wrapper around the exportfs code which deals with all these issues. At the same time we also make the checks for a valid handle strict enough to reject all not perfectly well formed handles - given that we never hand out others that's okay and simplifies the code. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com>	2009-01-19 14:43:18 +11:00
Linus Torvalds	4b48d9d44e	Merge git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable * git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable: Btrfs: fix ioctl arg size (userland incompatible change!) Btrfs: Clear the device->running_pending flag before bailing on congestion	2009-01-16 09:32:33 -08:00
Jan Kara	cc33412fb1	quota: Improve locking We implement dqget() and dqput() that need neither dqonoff_mutex nor dqptr_sem. Then move dqget() and dqput() calls so that they are not called from under dqptr_sem. This is important because filesystem callbacks aren't called from under dqptr_sem which used to cause lots of problems with lock ranking (and with OCFS2 they became close to unsolvable). The patch also removes two functions which were introduced solely because OCFS2 needed them to cope with the old locking scheme. As time showed, they were not enough for OCFS2 anyway and it would be unnecessary work to adapt them to the new locking scheme in which they aren't needed. As a result OCFS2 needs the following patch to compile properly with quotas. Sorry to any bisecters which hit this in advance. Signed-off-by: Jan Kara <jack@suse.cz>	2009-01-16 18:02:10 +01:00
Chris Mason	c071fcfdb6	Btrfs: fix ioctl arg size (userland incompatible change!) The structure used to send device in btrfs ioctl calls was not properly aligned, and so 32 bit ioctls would not work properly on 64 bit kernels. We could fix this with compat ioctls, but we're just one byte away and it doesn't make sense at this stage to carry about the compat ioctls forever at this stage in the project. This patch brings the ioctl arg up to an evenly aligned 4k. Signed-off-by: Chris Mason <chris.mason@oracle.com>	2009-01-16 11:59:08 -05:00
Chris Mason	1d9e2ae949	Btrfs: Clear the device->running_pending flag before bailing on congestion Btrfs maintains a queue of async bio submissions so the checksumming threads don't have to wait on get_request_wait. In order to avoid extra wakeups, this code has a running_pending flag that is used to tell new submissions they don't need to wake the thread. When the threads notice congestion on a single device, they may decide to requeue the job and move on to other devices. This makes sure the running_pending flag is cleared before the job is requeued. It should help avoid IO stalls by making sure the task is woken up when new submissions come in. Signed-off-by: Chris Mason <chris.mason@oracle.com>	2009-01-16 11:58:19 -05:00
Jan Kara	6b7021ef7e	ext2: also update the inode on disk when dir is IS_DIRSYNC We used to just write changed page for IS_DIRSYNC inodes. But we also have to update the directory inode itself just for the case that we've allocated a new block and changed i_size. [akpm@linux-foundation.org: still sync the data page] Signed-off-by: Jan Kara <jack@suse.cz> Tested-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-15 16:39:42 -08:00
Qinghuang Feng	1bcbf31337	btrfs & squashfs: Move btrfs and squashfsto's magic number to <linux/magic.h> Use the standard magic.h for btrfs and squashfs. Signed-off-by: Qinghuang Feng <qhfeng.kernel@gmail.com> Cc: Phillip Lougher <phillip@lougher.demon.co.uk> Cc: Chris Mason <chris.mason@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-15 16:39:38 -08:00
Linus Torvalds	bca268565f	Merge branch 'syscalls' of git://git390.osdl.marist.edu/pub/scm/linux-2.6 * 'syscalls' of git://git390.osdl.marist.edu/pub/scm/linux-2.6: (44 commits) [CVE-2009-0029] s390 specific system call wrappers [CVE-2009-0029] System call wrappers part 33 [CVE-2009-0029] System call wrappers part 32 [CVE-2009-0029] System call wrappers part 31 [CVE-2009-0029] System call wrappers part 30 [CVE-2009-0029] System call wrappers part 29 [CVE-2009-0029] System call wrappers part 28 [CVE-2009-0029] System call wrappers part 27 [CVE-2009-0029] System call wrappers part 26 [CVE-2009-0029] System call wrappers part 25 [CVE-2009-0029] System call wrappers part 24 [CVE-2009-0029] System call wrappers part 23 [CVE-2009-0029] System call wrappers part 22 [CVE-2009-0029] System call wrappers part 21 [CVE-2009-0029] System call wrappers part 20 [CVE-2009-0029] System call wrappers part 19 [CVE-2009-0029] System call wrappers part 18 [CVE-2009-0029] System call wrappers part 17 [CVE-2009-0029] System call wrappers part 16 [CVE-2009-0029] System call wrappers part 15 ...	2009-01-14 19:58:40 -08:00
Heiko Carstens	2b66421995	[CVE-2009-0029] System call wrappers part 33 Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2009-01-14 14:15:32 +01:00
Heiko Carstens	d4e82042c4	[CVE-2009-0029] System call wrappers part 32 Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2009-01-14 14:15:31 +01:00
Heiko Carstens	836f92adf1	[CVE-2009-0029] System call wrappers part 31 Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2009-01-14 14:15:31 +01:00
Heiko Carstens	6559eed8ca	[CVE-2009-0029] System call wrappers part 30 Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2009-01-14 14:15:30 +01:00
Heiko Carstens	2e4d0924eb	[CVE-2009-0029] System call wrappers part 29 Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2009-01-14 14:15:30 +01:00
Heiko Carstens	938bb9f5e8	[CVE-2009-0029] System call wrappers part 28 Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2009-01-14 14:15:30 +01:00
Heiko Carstens	1e7bfb2134	[CVE-2009-0029] System call wrappers part 27 Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2009-01-14 14:15:29 +01:00
Heiko Carstens	5a8a82b1d3	[CVE-2009-0029] System call wrappers part 23 Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2009-01-14 14:15:28 +01:00
Heiko Carstens	20f37034fb	[CVE-2009-0029] System call wrappers part 21 Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2009-01-14 14:15:26 +01:00
Heiko Carstens	3cdad42884	[CVE-2009-0029] System call wrappers part 20 Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2009-01-14 14:15:26 +01:00
Heiko Carstens	003d7ab479	[CVE-2009-0029] System call wrappers part 19 Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2009-01-14 14:15:26 +01:00
Heiko Carstens	ca013e945b	[CVE-2009-0029] System call wrappers part 17 Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2009-01-14 14:15:25 +01:00
Heiko Carstens	002c8976ee	[CVE-2009-0029] System call wrappers part 16 Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2009-01-14 14:15:25 +01:00
Heiko Carstens	a26eab2400	[CVE-2009-0029] System call wrappers part 15 Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2009-01-14 14:15:24 +01:00
Heiko Carstens	3480b25743	[CVE-2009-0029] System call wrappers part 14 Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2009-01-14 14:15:24 +01:00
Heiko Carstens	6a6160a7b5	[CVE-2009-0029] System call wrappers part 13 Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2009-01-14 14:15:23 +01:00
Heiko Carstens	64fd1de3d8	[CVE-2009-0029] System call wrappers part 12 Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2009-01-14 14:15:23 +01:00
Heiko Carstens	257ac264d6	[CVE-2009-0029] System call wrappers part 11 Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2009-01-14 14:15:23 +01:00
Heiko Carstens	bdc480e3be	[CVE-2009-0029] System call wrappers part 10 Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2009-01-14 14:15:22 +01:00
Heiko Carstens	a5f8fa9e9b	[CVE-2009-0029] System call wrappers part 09 Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2009-01-14 14:15:21 +01:00
Heiko Carstens	6673e0c3fb	[CVE-2009-0029] System call wrapper special cases System calls with an unsigned long long argument can't be converted with the standard wrappers since that would include a cast to long, which in turn means that we would lose the upper 32 bit on 32 bit architectures. Also semctl can't use the standard wrapper since it has a 'union' parameter. So we handle them as special case and add some extra wrappers instead. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2009-01-14 14:15:18 +01:00
Heiko Carstens	c9da9f2129	[CVE-2009-0029] Make sys_pselect7 static Not a single architecture has wired up sys_pselect7 plus it is the only system call with seven parameters. Just make it static and rename it to do_pselect which will do the work for sys_pselect6. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2009-01-14 14:15:16 +01:00
Heiko Carstens	1134723e96	[CVE-2009-0029] Remove __attribute__((weak)) from sys_pipe/sys_pipe2 Remove __attribute__((weak)) from common code sys_pipe implemantation. IA64, ALPHA, SUPERH (32bit) and SPARC (32bit) have own implemantations with the same name. Just rename them. For sys_pipe2 there is no architecture specific implementation. Cc: Richard Henderson <rth@twiddle.net> Cc: David S. Miller <davem@davemloft.net> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Tony Luck <tony.luck@intel.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2009-01-14 14:15:15 +01:00
Heiko Carstens	e55380edf6	[CVE-2009-0029] Rename old_readdir to sys_old_readdir This way it matches the generic system call name convention. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2009-01-14 14:15:15 +01:00
Heiko Carstens	2ed7c03ec1	[CVE-2009-0029] Convert all system calls to return a long Convert all system calls to return a long. This should be a NOP since all converted types should have the same size anyway. With the exception of sys_exit_group which returned void. But that doesn't matter since the system call doesn't return. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2009-01-14 14:15:14 +01:00
Lachlan McIlroy	cb7a97d015	Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 into for-linus	2009-01-14 16:29:51 +11:00
Bernd Schmidt	62568510b8	Fix timeouts in sys_pselect7 Since we (Analog Devices) updated our Blackfin kernel to 2.6.28, we've seen occasional 5-second hangs from telnet. telnetd calls select with a NULL timeout, but with the new kernel, the system call occasionally returns 0, which causes telnet to call sleep (5). This did not happen with earlier kernels. The code in sys_pselect7 looks a bit strange, in particular the variable "to" is initialized to NULL, then changed if a non-null timeout was passed in, but not used further. It needs to be passed to core_sys_select instead of &end_time. This bug was introduced by `8ff3e8e85f` ("select: switch select() and poll() over to hrtimers"). Signed-off-by: Bernd Schmidt <bernd.schmidt@analog.com> Reviewed-by: Ulrich Drepper <drepper@redhat.com> Tested-by: Robin Getz <rgetz@blackfin.uclinux.org> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-13 14:45:17 -08:00
Linus Torvalds	c69e8839c2	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm: dlm: change rsbtbl rwlock to spinlock dlm: fix seq_file usage in debugfs lock dump	2009-01-12 15:54:27 -08:00
Linus Torvalds	0176260fc3	btrfs: fix for write_super_lockfs/unlockfs error handling Commit `c4be0c1dc4` added the ability for write_super_lockfs to return errors, and renamed them to match. But btrfs didn't get converted. Do the minimal conversion to make it compile again. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-10 06:09:52 -08:00
Takashi Sato	8e961870bb	filesystem freeze: remove XFS specific ioctl interfaces for freeze feature It removes XFS specific ioctl interfaces and request codes for freeze feature. This patch has been supplied by David Chinner. Signed-off-by: Dave Chinner <dgc@sgi.com> Signed-off-by: Takashi Sato <t-sato@yk.jp.nec.com> Cc: Dave Chinner <david@fromorbit.com> Cc: <xfs-masters@oss.sgi.com> Cc: <linux-ext4@vger.kernel.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Cc: Alasdair G Kergon <agk@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-09 16:54:42 -08:00
Takashi Sato	fcccf50254	filesystem freeze: implement generic freeze feature The ioctls for the generic freeze feature are below. o Freeze the filesystem int ioctl(int fd, int FIFREEZE, arg) fd: The file descriptor of the mountpoint FIFREEZE: request code for the freeze arg: Ignored Return value: 0 if the operation succeeds. Otherwise, -1 o Unfreeze the filesystem int ioctl(int fd, int FITHAW, arg) fd: The file descriptor of the mountpoint FITHAW: request code for unfreeze arg: Ignored Return value: 0 if the operation succeeds. Otherwise, -1 Error number: If the filesystem has already been unfrozen, errno is set to EINVAL. [akpm@linux-foundation.org: fix CONFIG_BLOCK=n] Signed-off-by: Takashi Sato <t-sato@yk.jp.nec.com> Signed-off-by: Masayuki Hamaguchi <m-hamaguchi@ys.jp.nec.com> Cc: <xfs-masters@oss.sgi.com> Cc: <linux-ext4@vger.kernel.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Alasdair G Kergon <agk@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-09 16:54:42 -08:00
Takashi Sato	c4be0c1dc4	filesystem freeze: add error handling of write_super_lockfs/unlockfs Currently, ext3 in mainline Linux doesn't have the freeze feature which suspends write requests. So, we cannot take a backup which keeps the filesystem's consistency with the storage device's features (snapshot and replication) while it is mounted. In many case, a commercial filesystem (e.g. VxFS) has the freeze feature and it would be used to get the consistent backup. If Linux's standard filesystem ext3 has the freeze feature, we can do it without a commercial filesystem. So I have implemented the ioctls of the freeze feature. I think we can take the consistent backup with the following steps. 1. Freeze the filesystem with the freeze ioctl. 2. Separate the replication volume or create the snapshot with the storage device's feature. 3. Unfreeze the filesystem with the unfreeze ioctl. 4. Take the backup from the separated replication volume or the snapshot. This patch: VFS: Changed the type of write_super_lockfs and unlockfs from "void" to "int" so that they can return an error. Rename write_super_lockfs and unlockfs of the super block operation freeze_fs and unfreeze_fs to avoid a confusion. ext3, ext4, xfs, gfs2, jfs: Changed the type of write_super_lockfs and unlockfs from "void" to "int" so that write_super_lockfs returns an error if needed, and unlockfs always returns 0. reiserfs: Changed the type of write_super_lockfs and unlockfs from "void" to "int" so that they always return 0 (success) to keep a current behavior. Signed-off-by: Takashi Sato <t-sato@yk.jp.nec.com> Signed-off-by: Masayuki Hamaguchi <m-hamaguchi@ys.jp.nec.com> Cc: <xfs-masters@oss.sgi.com> Cc: <linux-ext4@vger.kernel.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Alasdair G Kergon <agk@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-09 16:54:42 -08:00
David Brownell	2d96d1053d	CORE_DUMP_DEFAULT_ELF_HEADERS depends on ELF_CORE Kernels that don't support ELF coredumps at all surely can't be supporting new partial-segment flavored ELF coredumps ... don't make folk answer Kconfig questions about that flavor. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Acked-by: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-09 16:54:41 -08:00
Linus Torvalds	9a100a4464	Merge git://git.kernel.org/pub/scm/linux/kernel/git/arjan/linux-2.6-async-2 * git://git.kernel.org/pub/scm/linux/kernel/git/arjan/linux-2.6-async-2: async: make async a command line option for now partial revert of asynchronous inode delete	2009-01-09 15:32:26 -08:00
Linus Torvalds	32b838b8cf	Merge git://git.infradead.org/mtd-2.6 * git://git.infradead.org/mtd-2.6: [JFFS2] remove junk prototypes	2009-01-09 15:29:04 -08:00
Linus Torvalds	31aeb6c815	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-linus * git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-linus: MAINTAINERS: squashfs entry Squashfs: documentation Squashfs: initrd support Squashfs: Kconfig entry Squashfs: Makefiles Squashfs: header files Squashfs: block operations Squashfs: cache operations Squashfs: uid/gid lookup operations Squashfs: fragment block operations Squashfs: export operations Squashfs: super block operations Squashfs: symlink operations Squashfs: regular file operations Squashfs: directory readdir operations Squashfs: directory lookup operations Squashfs: inode operations	2009-01-09 15:18:49 -08:00
Linus Torvalds	c40f6f8bbc	Merge git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-2.6-nommu * git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-2.6-nommu: NOMMU: Support XIP on initramfs NOMMU: Teach kobjsize() about VMA regions. FLAT: Don't attempt to expand the userspace stack to fill the space allocated FDPIC: Don't attempt to expand the userspace stack to fill the space allocated NOMMU: Improve procfs output using per-MM VMAs NOMMU: Make mmap allocation page trimming behaviour configurable. NOMMU: Make VMAs per MM as for MMU-mode linux NOMMU: Delete askedalloc and realalloc variables NOMMU: Rename ARM's struct vm_region NOMMU: Fix cleanup handling in ramfs_nommu_get_umapped_area()	2009-01-09 14:00:58 -08:00
Arjan van de Ven	b32714ba29	partial revert of asynchronous inode delete let the core of this one bake in -next as well, but leave some of the infrastructure in place. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>	2009-01-09 13:15:49 -08:00
Artem Bityutskiy	ab5610b434	[JFFS2] remove junk prototypes 'rb_prev()', 'rb_next()' and 'rb_replace_node()' are declared in include/linux/rbtree.h, no need for JFFS2 to re-declare them. I believe these are left-overs from the old days when the common RB tree code did not have those call and JFFS2 had private implementation. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2009-01-09 21:05:21 +00:00
Linus Torvalds	73d59314e6	Merge git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable * git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable: (864 commits) Btrfs: explicitly mark the tree log root for writeback Btrfs: Drop the hardware crc32c asm code Btrfs: Add Documentation/filesystem/btrfs.txt, remove old COPYING Btrfs: kmap_atomic(KM_USER0) is safe for btrfs_readpage_end_io_hook Btrfs: Don't use kmap_atomic(..., KM_IRQ0) during checksum verifies Btrfs: tree logging checksum fixes Btrfs: don't change file extent's ram_bytes in btrfs_drop_extents Btrfs: Use btrfs_join_transaction to avoid deadlocks during snapshot creation Btrfs: drop remaining LINUX_KERNEL_VERSION checks and compat code Btrfs: drop EXPORT symbols from extent_io.c Btrfs: Fix checkpatch.pl warnings Btrfs: Fix free block discard calls down to the block layer Btrfs: avoid orphan inode caused by log replay Btrfs: avoid potential super block corruption Btrfs: do not call kfree if kmalloc failed in btrfs_sysfs_add_super Btrfs: fix a memory leak in btrfs_get_sb Btrfs: Fix typo in clear_state_cb Btrfs: Fix memset length in btrfs_file_write Btrfs: update directory's size when creating subvol/snapshot Btrfs: add permission checks to the ioctls ...	2009-01-09 13:01:38 -08:00
Linus Torvalds	6ddaab20c3	Merge branch 'for-2.6.29' of git://git.kernel.dk/linux-2.6-block * 'for-2.6.29' of git://git.kernel.dk/linux-2.6-block: block: fix bug in ptbl lookup cache	2009-01-09 12:57:34 -08:00
Neil Brown	54b0d12769	block: fix bug in ptbl lookup cache Neil writes: Hi Jens, I've found a little bug for you. It was introduced by `a6f23657d3` block: add one-hit cache for disk partition lookup and has the effect of killing my machine whenever I try to assemble an md array :-( One of the devices in the array has partitions, and mdadm always deletes partitions before putting a whole-device in an array (as it can cause confusion). The next IO to that device locks the machine. I don't really understand exactly why it locks up, but it happens in disk_map_sector_rcu(). This patch fixes it. Which is due to a missing clear of the (now) stale partition lookup data. So clear that when we delete a partition. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-01-09 21:46:13 +01:00
Linus Torvalds	7c51d57e9d	Merge git://git.infradead.org/mtd-2.6 * git://git.infradead.org/mtd-2.6: (67 commits) [MTD] [MAPS] Fix printk format warning in nettel.c [MTD] [NAND] add cmdline parsing (mtdparts=) support to cafe_nand [MTD] CFI: remove major/minor version check for command set 0x0002 [MTD] [NAND] ndfc driver [MTD] [TESTS] Fix some size_t printk format warnings [MTD] LPDDR Makefile and KConfig [MTD] LPDDR extended physmap driver to support LPDDR flash [MTD] LPDDR added new pfow_base parameter [MTD] LPDDR Command set driver [MTD] LPDDR PFOW definition [MTD] LPDDR QINFO records definitions [MTD] LPDDR qinfo probing. [MTD] [NAND] pxa3xx: convert from ns to clock ticks more accurately [MTD] [NAND] pxa3xx: fix non-page-aligned reads [MTD] [NAND] fix nandsim sched.h references [MTD] [NAND] alauda: use USB API functions rather than constants [MTD] struct device - replace bus_id with dev_name(), dev_set_name() [MTD] fix m25p80 64-bit divisions [MTD] fix dataflash 64-bit divisions [MTD] [NAND] Set the fsl elbc ECCM according the settings in bootloader. ... Fixed up trivial debug conflicts in drivers/mtd/devices/{m25p80.c,mtd_dataflash.c}	2009-01-09 12:37:15 -08:00
Chris Mason	e293e97e36	Btrfs: explicitly mark the tree log root for writeback Each subvolume has an extent_state_tree used to mark metadata that needs to be sent to disk while syncing the tree. This is used in addition to the dirty bits on the pages themselves so that a single subvolume can be sent to disk efficiently in disk order. Normally this marking happens in btrfs_alloc_free_block, which also does special recording of dirty tree blocks for the tree log roots. Yan Zheng noticed that when the root of the log tree is allocated, it is added to the wrong writeback list. The fix used here is to explicitly set it dirty as part of tree log creation. Signed-off-by: Chris Mason <chris.mason@oracle.com>	2009-01-09 13:14:17 -05:00
Nick Piggin	0087167c9d	[XFS] use scalable vmap API Implement XFS's large buffer support with the new vmap APIs. See the vmap rewrite (`db64fe02`) for some numbers. The biggest improvement that comes from using the new APIs is avoiding the global KVA allocation lock on every call. Signed-off-by: Nick Piggin <npiggin@suse.de> Reviewed-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2009-01-09 17:09:47 +11:00
Nick Piggin	958f8c0e4f	[XFS] remove old vmap cache XFS's vmap batching simply defers a number (up to 64) of vunmaps, and keeps track of them in a list. To purge the batch, it just goes through the list and calls vunamp on each one. This is pretty poor: a global TLB flush is generally still performed on each vunmap, with the most expensive parts of the operation being the broadcast IPIs and locking involved in the SMP callouts, and the locking involved in the vmap management -- none of these are avoided by just batching up the calls. I'm actually surprised it ever made much difference. (Now that the lazy vmap allocator is upstream, this description is not quite right, but the vunmap batching still doesn't seem to do much) Rip all this logic out of XFS completely. I will improve vmap performance and scalability directly in subsequent patch. Signed-off-by: Nick Piggin <npiggin@suse.de> Reviewed-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2009-01-09 17:09:25 +11:00
Christoph Hellwig	058652a37d	[XFS] make xfs_ino_t an unsigned long long Currently xfs_ino_t is defined as a u64 which can either be an unsigned long long or on some 64 bit platforms and unsigned long. Just making it and unsigned long long mean's it's still always 64 bits wide, but we don't need to resort to cases to print it. Fixes a warning regression on 64 bit powerpc in current git. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2009-01-09 16:19:14 +11:00
Christoph Hellwig	1544031976	[XFS] truncate readdir offsets to signed 32 bit values John Stanley reported EOVERFLOW errors in readdir from his self-build glibc. I traced this down to glibc enabling d_off overflow checks in one of the about five million different getdents implementations. In 2.6.28 Dave Woodhouse moved our readdir double buffering required for NFS4 readdirplus into nfsd and at that point we lost the capping of the directory offsets to 32 bit signed values. Johns glibc used getdents64 to even implement readdir for normal 32 bit offset dirents, and failed with EOVERFLOW only if this happens on the first dirent in a getdents call. I managed to come up with a testcase that uses raw getdents and does the EOVERFLOW check manually. We always hit it with our last entry due to the special end of directory marker. The patch below is a dumb version of just putting back the masking, to make sure we have the same behavior as in 2.6.27 and earlier. I will work on a better and cleaner fix for 2.6.30. Reported-by: John Stanley <jpsinthemix@verizon.net> Tested-by: John Stanley <jpsinthemix@verizon.net> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2009-01-09 16:18:24 +11:00
Christoph Hellwig	e6edbd1c1c	[XFS] fix compile of xfs_btree_readahead_lblock on m68k Change the left/right variables to the proper always 64bit xfs_dfsbo_t type because otherwise compilation fails for Geert on m68k without CONFIG_LBD: \| fs/xfs/xfs_btree.c: In function 'xfs_btree_readahead_lblock': \| fs/xfs/xfs_btree.c:736: warning: comparison is always true due to limited range of data type \| fs/xfs/xfs_btree.c:741: warning: comparison is always true due to limited range of data type Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2009-01-09 16:16:51 +11:00
Eric Sandeen	fb82557f16	[XFS] Remove macro-to-function indirections in the mask code Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Reviewed-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2009-01-09 15:53:54 +11:00
Eric Sandeen	c9fb86a917	[XFS] Remove macro-to-function indirections in attr code Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Reviewed-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2009-01-09 15:46:44 +11:00
Eric Sandeen	9800b55035	[XFS] Remove several unused typedefs. Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Reviewed-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2009-01-09 15:46:16 +11:00
Christoph Hellwig	c9a98553d5	[XFS] pass XFS_IGET_BULKSTAT to xfs_iget for handle operations NFS clients or users of the handle ioctls can pass us arbitrary inode numbers through the exportfs interface. Make sure we use the XFS_IGET_BULKSTAT so that these don't cause shutdowns due to the corruption checks. Also translate the EINVAL we get back for invalid inode clusters into an ESTALE which is more appropinquate, and remove the useless check for a NULL inode on a successfull xfs_iget return. I have a testcase to reproduce this using the handle interface which I will submit to xfsqa. Reported-by: Mario Becroft <mb@gem.win.co.nz> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2009-01-09 15:17:17 +11:00
Linus Torvalds	2150edc6c5	Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (57 commits) jbd2: Fix oops in jbd2_journal_init_inode() on corrupted fs ext4: Remove "extents" mount option block: Add Kconfig help which notes that ext4 needs CONFIG_LBD ext4: Make printk's consistently prefixed with "EXT4-fs: " ext4: Add sanity checks for the superblock before mounting the filesystem ext4: Add mount option to set kjournald's I/O priority jbd2: Submit writes to the journal using WRITE_SYNC jbd2: Add pid and journal device name to the "kjournald2 starting" message ext4: Add markers for better debuggability ext4: Remove code to create the journal inode ext4: provide function to release metadata pages under memory pressure ext3: provide function to release metadata pages under memory pressure add releasepage hooks to block devices which can be used by file systems ext4: Fix s_dirty_blocks_counter if block allocation failed with nodelalloc ext4: Init the complete page while building buddy cache ext4: Don't allow new groups to be added during block allocation ext4: mark the blocks/inode bitmap beyond end of group as used ext4: Use new buffer_head flag to check uninit group bitmaps initialization ext4: Fix the race between read_inode_bitmap() and ext4_new_inode() ext4: code cleanup ...	2009-01-08 17:14:59 -08:00
Linus Torvalds	cd764695b6	Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (45 commits) [SCSI] qla2xxx: Update version number to 8.03.00-k1. [SCSI] qla2xxx: Add ISP81XX support. [SCSI] qla2xxx: Use proper request/response queues with MQ instantiations. [SCSI] qla2xxx: Correct MQ-chain information retrieval during a firmware dump. [SCSI] qla2xxx: Collapse EFT/FCE copy procedures during a firmware dump. [SCSI] qla2xxx: Don't pollute kernel logs with ZIO/RIO status messages. [SCSI] qla2xxx: Don't fallback to interrupt-polling during re-initialization with MSI-X enabled. [SCSI] qla2xxx: Remove support for reading/writing HW-event-log. [SCSI] cxgb3i: add missing include [SCSI] scsi_lib: fix DID_RESET status problems [SCSI] fc transport: restore missing dev_loss_tmo callback to LLDD [SCSI] aha152x_cs: Fix regression that keeps driver from using shared interrupts [SCSI] sd: Correctly handle 6-byte commands with DIX [SCSI] sd: DIF: Fix tagging on platforms with signed char [SCSI] sd: DIF: Show app tag on error [SCSI] Fix error handling for DIF/DIX [SCSI] scsi_lib: don't decrement busy counters when inserting commands [SCSI] libsas: fix test for negative unsigned and typos [SCSI] a2091, gvp11: kill warn_unused_result warnings [SCSI] fusion: Move a dereference below a NULL test ... Fixed up trivial conflict due to moving the async part of sd_probe around in the async probes vs using dev_set_name() in naming.	2009-01-08 16:27:31 -08:00
Linus Torvalds	894bcdfb1a	Merge branch 'for-linus' of git://neil.brown.name/md * 'for-linus' of git://neil.brown.name/md: md: don't retry recovery of raid1 that fails due to error on source drive. md: Allow md devices to be created by name. md: make devices disappear when they are no longer needed. md: centralise all freeing of an 'mddev' in 'md_free' md: move allocation of ->queue from mddev_find to md_probe md: need another print_sb for mdp_superblock_1 md: use list_for_each_entry macro directly md: raid0: make hash_spacing and preshift sector-based. md: raid0: Represent the size of strip zones in sectors. md: raid0 create_strip_zones(): Add KERN_INFO/KERN_ERR to printk's. md: raid0 create_strip_zones(): Make two local variables sector-based. md: raid0: Represent zone->zone_offset in sectors. md: raid0: Represent device offset in sectors. md: raid0_make_request(): Replace local variable block by sector. md: raid0_make_request(): Remove local variable chunk_size. md: raid0_make_request(): Replace chunksize_bits by chunksect_bits. md: use sysfs_notify_dirent to notify changes to md/sync_action. md: fix bitmap-on-external-file bug.	2009-01-08 14:03:34 -08:00
NeilBrown	d3374825ce	md: make devices disappear when they are no longer needed. Currently md devices, once created, never disappear until the module is unloaded. This is essentially because the gendisk holds a reference to the mddev, and the mddev holds a reference to the gendisk, this a circular reference. If we drop the reference from mddev to gendisk, then we need to ensure that the mddev is destroyed when the gendisk is destroyed. However it is not possible to hook into the gendisk destruction process to enable this. So we drop the reference from the gendisk to the mddev and destroy the gendisk when the mddev gets destroyed. However this has a complication. Between the call __blkdev_get->get_gendisk->kobj_lookup->md_probe and the call __blkdev_get->md_open there is no obvious way to hold a reference on the mddev any more, so unless something is done, it will disappear and gendisk will be destroyed prematurely. Also, once we decide to destroy the mddev, there will be an unlockable moment before the gendisk is unlinked (blk_unregister_region) during which a new reference to the gendisk can be created. We need to ensure that this reference can not be used. i.e. the ->open must fail. So: 1/ in md_probe we set a flag in the mddev (hold_active) which indicates that the array should be treated as active, even though there are no references, and no appearance of activity. This is cleared by md_release when the device is closed if it is no longer needed. This ensures that the gendisk will survive between md_probe and md_open. 2/ In md_open we check if the mddev we expect to open matches the gendisk that we did open. If there is a mismatch we return -ERESTARTSYS and modify __blkdev_get to retry from the top in that case. In the -ERESTARTSYS sys case we make sure to wait until the old gendisk (that we succeeded in opening) is really gone so we loop at most once. Some udev configurations will always open an md device when it first appears. If we allow an md device that was just created by an open to disappear on an immediate close, then this can race with such udev configurations and result in an infinite loop the device being opened and closed, then re-open due to the 'ADD' even from the first open, and then close and so on. So we make sure an md device, once created by an open, remains active at least until some md 'ioctl' has been made on it. This means that all normal usage of md devices will allow them to disappear promptly when not needed, but the worst that an incorrect usage will do it cause an inactive md device to be left in existence (it can easily be removed). As an array can be stopped by writing to a sysfs attribute echo clear > /sys/block/mdXXX/md/array_state we need to use scheduled work for deleting the gendisk and other kobjects. This allows us to wait for any pending gendisk deletion to complete by simply calling flush_scheduled_work(). Signed-off-by: NeilBrown <neilb@suse.de>	2009-01-09 08:31:10 +11:00

1 2 3 4 5 ...

12631 Commits (b7a9f29fcf4e53e9ca7982331649fa2013e69c99)