linux

Commit Graph

Author	SHA1	Message	Date
Sachin Kamat	cf0e3a64ca	f2fs: remove unneeded version.h header file from f2fs.h Including <linux/version.h> is not necessary. Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>	2012-12-11 13:43:42 +09:00
Jaegeuk Kim	a14d53937c	f2fs: update Kconfig and Makefile This adds Makefile and Kconfig for f2fs, and updates Makefile and Kconfig files in the fs directory. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2012-12-11 13:43:42 +09:00
Greg Kroah-Hartman	902829aa0b	f2fs: move proc files to debugfs This moves all of the f2fs debugging files into debugfs. The files are located in /sys/kernel/debug/f2fs/ Note, I think we are generating all of the same information in each of the files for every unique f2fs filesystem in the machine. This copies the functionality that was present in the proc files, but this should be fixed up in the future. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [jaegeuk.kim@samsung.com: merged 3 debugfs entries into a status entry] Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2012-12-11 13:43:42 +09:00
Jaegeuk Kim	d624c96fb3	f2fs: add recovery routines for roll-forward This adds roll-forward routines to recover fsynced data. - F2FS uses basically roll-back model with checkpointing. - In order to implement fsync(), there are two approaches as follows. 1. A roll-back model with checkpointing at every fsync() : This is a naive method, but suffers from very low performance. 2. A roll-forward model : F2FS adopts this model where all the fsynced data should be recovered, which were written after checkpointing was done. In order to figure out the data, F2FS keeps a "fsync" mark in direct node blocks. In addition, F2FS remains the location of next node block in each direct node block for reconstructing the chain of node blocks during the recovery. - In order to enhance the performance, F2FS keeps a "dentry" mark also in direct node blocks. If this is set during the recovery, F2FS replays adding a dentry. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2012-12-11 13:43:42 +09:00
Jaegeuk Kim	7bc0900347	f2fs: add garbage collection functions This adds on-demand and background cleaning functions. - The basic background cleaning policy is trying to do cleaning jobs as much as possible whenever the system is idle. Once the background cleaning is done, the cleaner sleeps an amount of time not to interfere with VFS calls. The time is dynamically adjusted according to the status of whole segments, which is decreased when the following conditions are satisfied. . GC is not conducted currently, and . IO subsystem is idle by checking the number of requets in bdev's request list, and . There are enough dirty segments. Otherwise, the time is increased incrementally until to the maximum time. Note that, min and max times are 10 secs and 30 secs by default. - F2FS adopts a default victim selection policy where background cleaning uses a cost-benefit algorithm, while on-demand cleaning uses a greedy algorithm. - The method of moving data during the cleaning is slightly different between background and on-demand cleaning schemes. In the case of background cleaning, F2FS loads the data, and marks them as dirty. Then, F2FS expects that the data will be moved by flusher or VM. In the case of on-demand cleaning, F2FS should move the data right away. - In order to identify valid blocks in a victim segment, F2FS scans the bitmap of the segment managed as an SIT entry. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2012-12-11 13:43:41 +09:00
Jaegeuk Kim	af48b85b8c	f2fs: add xattr and acl functionalities This implements xattr and acl functionalities. - F2FS uses a node page to contain use extended attributes. Signed-off-by: Changman Lee <cm224.lee@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2012-12-11 13:43:41 +09:00
Jaegeuk Kim	6b4ea0160a	f2fs: add core directory operations this adds core functions to find, add, delete, and link dentries. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2012-12-11 13:43:41 +09:00
Jaegeuk Kim	57397d86c6	f2fs: add inode operations for special inodes This adds inode operations for directory, symlink, and special inodes. Signed-off-by: Changman Lee <cm224.lee@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2012-12-11 13:43:41 +09:00
Jaegeuk Kim	19f99cee20	f2fs: add core inode operations This adds core functions to get, read, write, and evict an inode. Signed-off-by: Changman Lee <cm224.lee@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2012-12-11 13:43:41 +09:00
Jaegeuk Kim	eb47b8009d	f2fs: add address space operations for data This adds address space operations for data. - F2FS supports readpages(), writepages(), and direct_IO(). - Because of out-of-place writes, f2fs_direct_IO() does not write data in place. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2012-12-11 13:43:41 +09:00
Jaegeuk Kim	fbfa2cc58d	f2fs: add file operations This adds memory operations and file/file_inode operations. - F2FS supports fallocate(), mmap(), fsync(), and basic ioctl(). Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2012-12-11 13:43:41 +09:00
Jaegeuk Kim	351df4b201	f2fs: add segment operations This adds specific functions not only to manage dirty/free segments, SIT pages, a cache for SIT entries, and summary entries, but also to allocate free blocks and write three types of pages: data, node, and meta. - F2FS maintains three types of bitmaps in memory, which indicate free, prefree, and dirty segments respectively. - The key information of an SIT entry consists of a segment number, the number of valid blocks in the segment, a bitmap to identify there-in valid or invalid blocks. - An SIT page is composed of a certain range of SIT entries, which is maintained by the address space of meta_inode. - To cache SIT entries, a simple array is used. The index for the array is the segment number. - A summary entry for data contains the parent node information. A summary entry for node contains its node offset from the inode. - F2FS manages information about six active logs and those summary entries in memory. Whenever one of them is changed, its summary entries are flushed to its SIT page maintained by the address space of meta_inode. - This patch adds a default block allocation function which supports heap-based allocation policy. - This patch adds core functions to write data, node, and meta pages. Since LFS basically produces a series of sequential writes, F2FS merges sequential bios with a single one as much as possible to reduce the IO scheduling overhead. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2012-12-11 13:43:40 +09:00
Jaegeuk Kim	e05df3b115	f2fs: add node operations This adds specific functions to manage NAT pages, a cache for NAT entries, free nids, direct/indirect node blocks for indexing data, and address space for node pages. - The key information of an NAT entry consists of a node id and a block address. - An NAT page is composed of block addresses covered by a certain range of NAT entries, which is maintained by the address space of meta_inode. - A radix tree structure is used to cache NAT entries. The index for the tree is a node id. - When there is no free nid, F2FS should scan NAT entries to find new one. In order to avoid scanning frequently, F2FS manages a list containing a number of free nids in memory. Only when free nids in the list are exhausted, scanning process, build_free_nids(), is triggered. - F2FS has direct and indirect node blocks for indexing data. This patch adds fuctions related to the node block management such as getting, allocating, and truncating node blocks to index data. - In order to cache node blocks in memory, F2FS has a node_inode with an address space for node pages. This patch also adds the address space operations for node_inode. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2012-12-11 13:43:40 +09:00
Jaegeuk Kim	127e670abf	f2fs: add checkpoint operations This adds functions required by the checkpoint operations. Basically, f2fs adopts a roll-back model with checkpoint blocks written in the CP area. The checkpoint procedure includes as follows. - write_checkpoint() 1. block_operations() freezes VFS calls. 2. submit cached bios. 3. flush_nat_entries() writes NAT pages updated by dirty NAT entries. 4. flush_sit_entries() writes SIT pages updated by dirty SIT entries. 5. do_checkpoint() writes, - checkpoint block (#0) - orphan inode blocks - summary blocks made by active logs - checkpoint block (copy of #0) 6. unblock_opeations() In order to provide an address space for meta pages, f2fs_sb_info has a special inode, namely meta_inode. This patch also adds the address space operations for meta_inode. Signed-off-by: Chul Lee <chur.lee@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2012-12-11 13:43:40 +09:00
Jaegeuk Kim	aff063e266	f2fs: add super block operations This adds the implementation of superblock operations for f2fs, which includes - init_f2fs_fs/exit_f2fs_fs - f2fs_mount - super_operations of f2fs Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2012-12-11 13:43:40 +09:00
Jaegeuk Kim	39a53e0ce0	f2fs: add superblock and major in-memory structure This adds the following major in-memory structures in f2fs. - f2fs_sb_info: contains f2fs-specific information, two special inode pointers for node and meta address spaces, and orphan inode management. - f2fs_inode_info: contains vfs_inode and other fs-specific information. - f2fs_nm_info: contains node manager information such as NAT entry cache, free nid list, and NAT page management. - f2fs_node_info: represents a node as node id, inode number, block address, and its version. - f2fs_sm_info: contains segment manager information such as SIT entry cache, free segment map, current active logs, dirty segment management, and segment utilization. The specific structures are sit_info, free_segmap_info, dirty_seglist_info, curseg_info. In addition, add F2FS_SUPER_MAGIC in magic.h. Signed-off-by: Chul Lee <chur.lee@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2012-12-11 13:43:40 +09:00
Linus Torvalds	684c9aaebb	vfs: fix O_DIRECT read past end of block device The direct-IO write path already had the i_size checks in mm/filemap.c, but it turns out the read path did not, and removing the block size checks in fs/block_dev.c (commit bbec0270bdd8: "blkdev_max_block: make private to fs/buffer.c") removed the magic "shrink IO to past the end of the device" code there. Fix it by truncating the IO to the size of the block device, like the write path already does. NOTE! I suspect the write path would be much better off doing it this way in fs/block_dev.c, rather than hidden deep in mm/filemap.c. The mm/filemap.c code is extremely hard to follow, and has various conditionals on the target being a block device (ie the flag passed in to 'generic_write_checks()', along with a conditional update of the inode timestamp etc). It is also quite possible that we should treat this whole block device size as a "s_maxbytes" issue, and try to make the logic even more generic. However, in the meantime this is the fairly minimal targeted fix. Noted by Milan Broz thanks to a regression test for the cryptsetup reencrypt tool. Reported-and-tested-by: Milan Broz <mbroz@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-12-08 08:28:26 -08:00
Dan Carpenter	27d7c2a006	vfs: clear to the end of the buffer on partial buffer reads READ is zero so the "rw & READ" test is always false. The intended test was "((rw & RW_MASK) == READ)". Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-12-05 10:32:59 -08:00
Linus Torvalds	57302e0ddf	vfs: avoid "attempt to access beyond end of device" warnings The block device access simplification that avoided accessing the (racy) block size information (commit bbec0270bdd8: "blkdev_max_block: make private to fs/buffer.c") no longer checks the maximum block size in the block mapping path. That was _almost_ as simple as just removing the code entirely, because the readers and writers all check the size of the device anyway, so under normal circumstances it "just worked". However, the block size may be such that the end of the device may straddle one single buffer_head. At which point we may still want to access the end of the device, but the buffer we use to access it partially extends past the end. The 'bd_set_size()' function intentionally sets the block size to avoid this, but mounting the device - or setting the block size by hand to some other value - can modify that block size. So instead, teach 'submit_bh()' about the special case of the buffer head straddling the end of the device, and turning such an access into a smaller IO access, avoiding the problem. This, btw, also means that unlike before, we can now access the whole device regardless of device block size setting. So now, even if the device size is only 512-byte aligned, we can read and write even the last sector even when having a much bigger block size for accessing the rest of the device. So with this, we could now get rid of the 'bd_set_size()' block size code entirely - resulting in faster IO for the common case - but that would be a separate patch. Reported-and-tested-by: Romain Francoise <romain@orebokech.com> Reporeted-and-tested-by: Meelis Roos <mroos@linux.ee> Reported-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-12-04 08:25:11 -08:00
Linus Torvalds	d3594ea2b3	Merge branch 'block-dev' Merge 'block-dev' branch. I was going to just mark everything here for stable and leave it to the 3.8 merge window, but having decided on doing another -rc, I migth as well merge it now. This removes the bd_block_size_semaphore semaphore that was added in this release to fix a race condition between block size changes and block IO, and replaces it with atomicity guaratees in fs/buffer.c instead, along with simplifying fs/block-dev.c. This removes more lines than it adds, makes the code generally simpler, and avoids the latency/rt issues that the block size semaphore introduced for mount. I'm not happy with the timing, but it wouldn't be much better doing this during the merge window and then having some delayed back-port of it into stable. * block-dev: blkdev_max_block: make private to fs/buffer.c direct-io: don't read inode->i_blkbits multiple times blockdev: remove bd_block_size_semaphore again fs/buffer.c: make block-size be per-page and protected by the page lock	2012-12-03 10:53:25 -08:00
Linus Torvalds	331fee3cd3	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs fixes from Al Viro: "A bunch of fixes; the last one is this cycle regression, the rest are -stable fodder." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fix off-by-one in argument passed by iterate_fd() to callbacks lookup_one_len: don't accept . and .. cifs: get rid of blind d_drop() in readdir nfs_lookup_revalidate(): fix a leak don't do blind d_drop() in nfs_prime_dcache()	2012-12-01 13:29:55 -08:00
Linus Torvalds	086486e46e	Merge branch 'for-linus' of git://git.samba.org/sfrench/cifs-2.6 Pull CIFS fixes from Steve French: "Two low risk, small fixes, that fix cifs regressions introduced in 3.7." * 'for-linus' of git://git.samba.org/sfrench/cifs-2.6: CIFS: Fix wrong buffer pointer usage in smb_set_file_info cifs: fix writeback race with file that is growing	2012-11-30 16:57:18 -08:00
Al Viro	a77cfcb429	fix off-by-one in argument passed by iterate_fd() to callbacks Noticed by Pavel Roskin; the thing in his patch I disagree with was compensating for that shite in callbacks instead of fixing it once in the iterator itself. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-11-29 23:01:30 -05:00
Al Viro	21d8a15ac3	lookup_one_len: don't accept . and .. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-11-29 22:17:21 -05:00
Al Viro	0903a0c849	cifs: get rid of blind d_drop() in readdir Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-11-29 22:11:06 -05:00
Al Viro	c44600c9d1	nfs_lookup_revalidate(): fix a leak We are leaking fattr and fhandle if we decide that dentry is not to be invalidated, after all (e.g. happens to be a mountpoint). Just free both before that... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-11-29 22:04:36 -05:00
Al Viro	696199f8cc	don't do blind d_drop() in nfs_prime_dcache() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-11-29 22:00:51 -05:00
Linus Torvalds	bbec0270bd	blkdev_max_block: make private to fs/buffer.c We really don't want to look at the block size for the raw block device accesses in fs/block-dev.c, because it may be changing from under us. So get rid of the max_block logic entirely, since the caller should already have done it anyway. That leaves the only user of this function in fs/buffer.c, so move the whole function there and make it static. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-11-29 17:48:12 -08:00
Linus Torvalds	ab73857e35	direct-io: don't read inode->i_blkbits multiple times Since directio can work on a raw block device, and the block size of the device can change under it, we need to do the same thing that fs/buffer.c now does: read the block size a single time, using ACCESS_ONCE(). Reading it multiple times can get different results, which will then confuse the code because it actually encodes the i_blksize in relationship to the underlying logical blocksize. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-11-29 12:38:44 -08:00
Linus Torvalds	1e8b33328a	blockdev: remove bd_block_size_semaphore again This reverts the block-device direct access code to the previous unlocked code, now that fs/buffer.c no longer needs external locking. With this, fs/block_dev.c is back to the original version, apart from a whitespace cleanup that I didn't want to revert. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-11-29 10:52:19 -08:00
Linus Torvalds	45bce8f3e3	fs/buffer.c: make block-size be per-page and protected by the page lock This makes the buffer size handling be a per-page thing, which allows us to not have to worry about locking too much when changing the buffer size. If a page doesn't have buffers, we still need to read the block size from the inode, but we can do that with ACCESS_ONCE(), so that even if the size is changing, we get a consistent value. This doesn't convert all functions - many of the buffer functions are used purely by filesystems, which in turn results in the buffer size being fixed at mount-time. So they don't have the same consistency issues that the raw device access can have. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-11-29 10:47:20 -08:00
Pavel Shilovsky	c772aa92b6	CIFS: Fix wrong buffer pointer usage in smb_set_file_info Commit `6bdf6dbd66` caused a regression in setattr codepath that leads to files with wrong attributes. Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com>	2012-11-28 10:02:46 -06:00
Jeff Layton	3a98b86143	cifs: fix writeback race with file that is growing Commit `eddb079deb` created a regression in the writepages codepath. Previously, whenever it needed to check the size of the file, it did so by consulting the inode->i_size field directly. With that patch, the i_size was fetched once on entry into the writepages code and that value was used henceforth. If the file is changing size though (for instance, if someone is writing to it or has truncated it), then that value is likely to be wrong. This can lead to data corruption. Pages past the EOF at the time that the writepages call was issued may be silently dropped and ignored because cifs_writepages wrongly assumes that the file must have been truncated in the interim. Fix cifs_writepages to properly fetch the size from the inode->i_size field instead to properly account for this possibility. Original bug report is here: https://bugzilla.kernel.org/show_bug.cgi?id=50991 Reported-and-Tested-by: Maxim Britov <ungifted01@gmail.com> Reviewed-by: Suresh Jayaraman <sjayaraman@suse.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com>	2012-11-27 13:46:12 -06:00
Linus Torvalds	2844a48706	Merge branch 'akpm' (Fixes from Andrew) Merge misc fixes from Andrew Morton: "8 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (8 patches) futex: avoid wake_futex() for a PI futex_q watchdog: using u64 in get_sample_period() writeback: put unused inodes to LRU after writeback completion mm: vmscan: check for fatal signals iff the process was throttled Revert "mm: remove __GFP_NO_KSWAPD" proc: check vma->vm_file before dereferencing UAPI: strip the _UAPI prefix from header guards during header installation include/linux/bug.h: fix sparse warning related to BUILD_BUG_ON_INVALID	2012-11-26 18:33:33 -08:00
Linus Torvalds	87726c334b	Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull ext3 regression fix from Jan Kara: "Fix an ext3 regression introduced during 3.7 merge window. It leads to deadlock if you stress the filesystem in the right way (luckily only if blocksize < pagesize)." * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: jbd: Fix lock ordering bug in journal_unmap_buffer()	2012-11-26 17:42:07 -08:00
Jan Kara	4eff96dd52	writeback: put unused inodes to LRU after writeback completion Commit `169ebd9013` ("writeback: Avoid iput() from flusher thread") removed iget-iput pair from inode writeback. As a side effect, inodes that are dirty during iput_final() call won't be ever added to inode LRU (iput_final() doesn't add dirty inodes to LRU and later when the inode is cleaned there's noone to add the inode there). Thus inodes are effectively unreclaimable until someone looks them up again. The practical effect of this bug is limited by the fact that inodes are pinned by a dentry for long enough that the inode gets cleaned. But still the bug can have nasty consequences leading up to OOM conditions under certain circumstances. Following can easily reproduce the problem: for (( i = 0; i < 1000; i++ )); do mkdir $i for (( j = 0; j < 1000; j++ )); do touch $i/$j echo 2 > /proc/sys/vm/drop_caches done done then one needs to run 'sync; ls -lR' to make inodes reclaimable again. We fix the issue by inserting unused clean inodes into the LRU after writeback finishes in inode_sync_complete(). Signed-off-by: Jan Kara <jack@suse.cz> Reported-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Dave Chinner <david@fromorbit.com> Cc: <stable@vger.kernel.org> [3.5+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-11-26 17:41:24 -08:00
Stanislav Kinsbursky	05f564849d	proc: check vma->vm_file before dereferencing Commit `7b540d0646` ("proc_map_files_readdir(): don't bother with grabbing files") switched proc_map_files_readdir() to use @f_mode directly instead of grabbing @file reference, but same time the test for @vm_file presence was lost leading to nil dereference. The patch brings the test back. The all proc_map_files feature is CONFIG_CHECKPOINT_RESTORE wrapped (which is set to 'n' by default) so the bug doesn't affect regular kernels. The regression is 3.7-rc1 only as far as I can tell. [gorcunov@openvz.org: provided changelog] Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Acked-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-11-26 17:41:24 -08:00
Linus Torvalds	35f95d228e	Most important part of this is that it fixes a regression in Samsung NAND chip detection, introduced by some rework which went into 3.7. The initial fix wasn't quite complete, so it's in two parts. In fact the first part is committed twice (Artem committed his own copy of the same patch) and I've merged Artem's tree into mine which already had that fix. I'd have recommitted that to make it somewhat cleaner, but figured by this point in the release cycle it was better to merge exactly the commits which have been in linux-next. If I'd recommitted, I'd also omit the sparse warning fix. But it's there, and it's harmless — just marking one function as 'static' in onenand code. This also includes a couple more fixes for stable: an AB-BA deadlock in JFFS2, and an invalid range check in slram. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iEYEABECAAYFAlCwEIsACgkQdwG7hYl686NfZgCfSYFA2q8yp7jEMdDaxpFPuuDm FFMAoI3V27BpWxRab6GylYh8erHp9ful =Wo+T -----END PGP SIGNATURE----- Merge tag 'for-linus-20121123' of git://git.infradead.org/mtd-2.6 Pull MTD fixes from David Woodhouse: "The most important part of this is that it fixes a regression in Samsung NAND chip detection, introduced by some rework which went into 3.7. The initial fix wasn't quite complete, so it's in two parts. In fact the first part is committed twice (Artem committed his own copy of the same patch) and I've merged Artem's tree into mine which already had that fix. I'd have recommitted that to make it somewhat cleaner, but figured by this point in the release cycle it was better to merge exactly the commits which have been in linux-next. If I'd recommitted, I'd also omit the sparse warning fix. But it's there, and it's harmless — just marking one function as 'static' in onenand code. This also includes a couple more fixes for stable: an AB-BA deadlock in JFFS2, and an invalid range check in slram." * tag 'for-linus-20121123' of git://git.infradead.org/mtd-2.6: mtd: nand: fix Samsung SLC detection regression mtd: nand: fix Samsung SLC NAND identification regression jffs2: Fix lock acquisition order bug in jffs2_write_begin mtd: onenand: Make flexonenand_set_boundary static mtd: slram: invalid checking of absolute end address mtd: ofpart: Fix incorrect NULL check in parse_ofoldpart_partitions() mtd: nand: fix Samsung SLC NAND identification regression	2012-11-23 15:12:17 -10:00
Jan Kara	25389bb207	jbd: Fix lock ordering bug in journal_unmap_buffer() Commit `09e05d48` introduced a wait for transaction commit into journal_unmap_buffer() in the case we are truncating a buffer undergoing commit in the page stradding i_size on a filesystem with blocksize < pagesize. Sadly we forgot to drop buffer lock before waiting for transaction commit and thus deadlock is possible when kjournald wants to lock the buffer. Fix the problem by dropping the buffer lock before waiting for transaction commit. Since we are still holding page lock (and that is OK), buffer cannot disappear under us. CC: stable@vger.kernel.org # Wherever commit `09e05d48` was taken Signed-off-by: Jan Kara <jack@suse.cz>	2012-11-23 15:17:18 +01:00
Linus Torvalds	ca6215dfc7	Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull reiserfs and ext3 fixes from Jan Kara: "Fixes of reiserfs deadlocks when quotas are enabled (locking there was completely busted by BKL conversion) and also one small ext3 fix in the trim interface." * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: ext3: Avoid underflow of in ext3_trim_fs() reiserfs: Move quota calls out of write lock reiserfs: Protect reiserfs_quota_write() with write lock reiserfs: Protect reiserfs_quota_on() with write lock reiserfs: Fix lock ordering during remount	2012-11-20 18:48:25 -10:00
Lukas Czerner	ae49eeec78	ext3: Avoid underflow of in ext3_trim_fs() Currently if len argument in ext3_trim_fs() is smaller than one block, the 'end' variable underflow. Avoid that by returning EINVAL if len is smaller than file system block. Also remove useless unlikely(). Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz>	2012-11-19 21:36:12 +01:00
Jan Kara	7af1168693	reiserfs: Move quota calls out of write lock Calls into highlevel quota code cannot happen under the write lock. These calls take dqio_mutex which ranks above write lock. So drop write lock before calling back into quota code. CC: stable@vger.kernel.org # >= 3.0 Signed-off-by: Jan Kara <jack@suse.cz>	2012-11-19 21:34:33 +01:00
Jan Kara	361d94a338	reiserfs: Protect reiserfs_quota_write() with write lock Calls into reiserfs journalling code and reiserfs_get_block() need to be protected with write lock. We remove write lock around calls to high level quota code in the next patch so these paths would suddently become unprotected. CC: stable@vger.kernel.org # >= 3.0 Signed-off-by: Jan Kara <jack@suse.cz>	2012-11-19 21:34:33 +01:00
Jan Kara	b9e06ef2e8	reiserfs: Protect reiserfs_quota_on() with write lock In reiserfs_quota_on() we do quite some work - for example unpacking tail of a quota file. Thus we have to hold write lock until a moment we call back into the quota code. CC: stable@vger.kernel.org # >= 3.0 Signed-off-by: Jan Kara <jack@suse.cz>	2012-11-19 21:34:32 +01:00
Jan Kara	3bb3e1fc47	reiserfs: Fix lock ordering during remount When remounting reiserfs dquot_suspend() or dquot_resume() can be called. These functions take dqonoff_mutex which ranks above write lock so we have to drop it before calling into quota code. CC: stable@vger.kernel.org # >= 3.0 Signed-off-by: Jan Kara <jack@suse.cz>	2012-11-19 21:34:32 +01:00
Al Viro	3587b1b097	fanotify: fix FAN_Q_OVERFLOW case of fanotify_read() If the FAN_Q_OVERFLOW bit set in event->mask, the fanotify event metadata will not contain a valid file descriptor, but copy_event_to_user() didn't check for that, and unconditionally does a fd_install() on the file descriptor. Which in turn will cause a BUG_ON() in __fd_install(). Introduced by commit `352e3b2492` ("fanotify: sanitize failure exits in copy_event_to_user()") Mea culpa - missed that path ;-/ Reported-by: Alex Shi <lkml.alex@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-11-18 09:30:00 -10:00
Linus Torvalds	8d938105e4	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull misc VFS fixes from Al Viro: "Remove a bogus BUG_ON() that can trigger spuriously + alpha bits of do_mount() constification I'd missed during the merge window." This pull request came in a week ago, I missed it for some reason. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: kill bogus BUG_ON() in do_close_on_exec() missing const in alpha callers of do_mount()	2012-11-18 09:13:48 -10:00
Linus Torvalds	d28d3730fd	xfs: bugfixes for 3.7-rc7 - fix attr tree double split corruption - fix broken error handling in xfs_vm_writepage - drop buffer io reference when a bad bio is built -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) iQIcBAABAgAGBQJQp7sfAAoJENaLyazVq6ZOWHwP/2WTlenvM74i8HDa/nYW8KTC EubCZ6X1C7LPTV9tm9YUpKZ1VtI1O+OmuGcSmWdBKSMMoBVNyKvWXvrJeVKBVtXV sQ/jh1zCiPYzt9DfxGuarkw8Uy5qKNOYrbEAK1WwPMeOsDODYncfmTm+A/VYMeTt bWOjaxFd5QQOMuf0x9NO/keZc84R5l51ezYxA7HyYa5XvV/MDmLLVL0IhuSTFKyw oOiQMp0hby4zsJg6nqu/eINmdlgBIw+32m8aMSB2jreUQm4yvt0CY7M3Zq6sPmsM 2tC6cFonPw31FBBu9jvv9h5wNz7McyzxtZBS0+zDV+7K0UrIyxWm1BhzZIXoXzLz vHwc4gnZV8nOP/g34aftHLYYRD3ZJhG8mX5AdBRzlWWqDSFvYVEq+1evHrv8kk4l coTapzimNnR3aJ16qdP1M0gExKO9nrGVqrRi8ndLNbxLpxC9mFG7CfJBQPMumukX G8pTV1wQvqONHDNlN4mxqMBHN0d9dGp5xjYQ0Q92/siIA1C5szjCwTHekKNrP6Ol 7xd+nO7Xcgj7Uwaakv31paqOSAGhla6H5jvxPF2A54hZWQqlp88QpChLt3LFPxwh tEYTEf1zRoaoCS4TD3zMYTLY+9cXvUybSIf3hbgns+JMYHJtuZdzbvcaXE6Wl4Jr 6esA5fsBFP1J2/EzpLof =depY -----END PGP SIGNATURE----- Merge tag 'for-linus-v3.7-rc7' of git://oss.sgi.com/xfs/xfs Pull xfs bugfixes from Ben Myers: - fix attr tree double split corruption - fix broken error handling in xfs_vm_writepage - drop buffer io reference when a bad bio is built * tag 'for-linus-v3.7-rc7' of git://oss.sgi.com/xfs/xfs: xfs: drop buffer io reference when a bad bio is built xfs: fix broken error handling in xfs_vm_writepage xfs: fix attr tree double split corruption	2012-11-18 08:29:34 -10:00
Dave Chinner	d69043c42d	xfs: drop buffer io reference when a bad bio is built Error handling in xfs_buf_ioapply_map() does not handle IO reference counts correctly. We increment the b_io_remaining count before building the bio, but then fail to decrement it in the failure case. This leads to the buffer never running IO completion and releasing the reference that the IO holds, so at unmount we can leak the buffer. This leak is captured by this assert failure during unmount: XFS: Assertion failed: atomic_read(&pag->pag_ref) == 0, file: fs/xfs/xfs_mount.c, line: 273 This is not a new bug - the b_io_remaining accounting has had this problem for a long, long time - it's just very hard to get a zero length bio being built by this code... Further, the buffer IO error can be overwritten on a multi-segment buffer by subsequent bio completions for partial sections of the buffer. Hence we should only set the buffer error status if the buffer is not already carrying an error status. This ensures that a partial IO error on a multi-segment buffer will not be lost. This part of the problem is a regression, however. cc: <stable@vger.kernel.org> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>	2012-11-17 09:36:57 -06:00
Dave Chinner	3daed8bc3e	xfs: fix broken error handling in xfs_vm_writepage When we shut down the filesystem, it might first be detected in writeback when we are allocating a inode size transaction. This happens after we have moved all the pages into the writeback state and unlocked them. Unfortunately, if we fail to set up the transaction we then abort writeback and try to invalidate the current page. This then triggers are BUG() in block_invalidatepage() because we are trying to invalidate an unlocked page. Fixing this is a bit of a chicken and egg problem - we can't allocate the transaction until we've clustered all the pages into the IO and we know the size of it (i.e. whether the last block of the IO is beyond the current EOF or not). However, we don't want to hold pages locked for long periods of time, especially while we lock other pages to cluster them into the write. To fix this, we need to make a clear delineation in writeback where errors can only be handled by IO completion processing. That is, once we have marked a page for writeback and unlocked it, we have to report errors via IO completion because we've already started the IO. We may not have submitted any IO, but we've changed the page state to indicate that it is under IO so we must now use the IO completion path to report errors. To do this, add an error field to xfs_submit_ioend() to pass it the error that occurred during the building on the ioend chain. When this is non-zero, mark each ioend with the error and call xfs_finish_ioend() directly rather than building bios. This will immediately push the ioends through completion processing with the error that has occurred. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>	2012-11-17 09:35:42 -06:00

1 2 3 4 5 ...

29235 Commits (cf0e3a64cad19acd5904946d0647d751c1671620)