linux/fs/reiserfs
Frederic Weisbecker 9b467e6ebe reiserfs: don't lock root inode searching
Nothing requires that we lock the filesystem until the root inode is
provided.

Also iget5_locked() triggers a warning because we are holding the
filesystem lock while allocating the inode, which result in a lockdep
suspicion that we have a lock inversion against the reclaim path:

[ 1986.896979] =================================
[ 1986.896990] [ INFO: inconsistent lock state ]
[ 1986.896997] 3.1.1-main #8
[ 1986.897001] ---------------------------------
[ 1986.897007] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
[ 1986.897016] kswapd0/16 [HC0[0]:SC0[0]:HE1:SE1] takes:
[ 1986.897023]  (&REISERFS_SB(s)->lock){+.+.?.}, at: [<c01f8bd4>] reiserfs_write_lock+0x20/0x2a
[ 1986.897044] {RECLAIM_FS-ON-W} state was registered at:
[ 1986.897050]   [<c014a5b9>] mark_held_locks+0xae/0xd0
[ 1986.897060]   [<c014aab3>] lockdep_trace_alloc+0x7d/0x91
[ 1986.897068]   [<c0190ee0>] kmem_cache_alloc+0x1a/0x93
[ 1986.897078]   [<c01e7728>] reiserfs_alloc_inode+0x13/0x3d
[ 1986.897088]   [<c01a5b06>] alloc_inode+0x14/0x5f
[ 1986.897097]   [<c01a5cb9>] iget5_locked+0x62/0x13a
[ 1986.897106]   [<c01e99e0>] reiserfs_fill_super+0x410/0x8b9
[ 1986.897114]   [<c01953da>] mount_bdev+0x10b/0x159
[ 1986.897123]   [<c01e764d>] get_super_block+0x10/0x12
[ 1986.897131]   [<c0195b38>] mount_fs+0x59/0x12d
[ 1986.897138]   [<c01a80d1>] vfs_kern_mount+0x45/0x7a
[ 1986.897147]   [<c01a83e3>] do_kern_mount+0x2f/0xb0
[ 1986.897155]   [<c01a987a>] do_mount+0x5c2/0x612
[ 1986.897163]   [<c01a9a72>] sys_mount+0x61/0x8f
[ 1986.897170]   [<c044060c>] sysenter_do_call+0x12/0x32
[ 1986.897181] irq event stamp: 7509691
[ 1986.897186] hardirqs last  enabled at (7509691): [<c0190f34>] kmem_cache_alloc+0x6e/0x93
[ 1986.897197] hardirqs last disabled at (7509690): [<c0190eea>] kmem_cache_alloc+0x24/0x93
[ 1986.897209] softirqs last  enabled at (7508896): [<c01294bd>] __do_softirq+0xee/0xfd
[ 1986.897222] softirqs last disabled at (7508859): [<c01030ed>] do_softirq+0x50/0x9d
[ 1986.897234]
[ 1986.897235] other info that might help us debug this:
[ 1986.897242]  Possible unsafe locking scenario:
[ 1986.897244]
[ 1986.897250]        CPU0
[ 1986.897254]        ----
[ 1986.897257]   lock(&REISERFS_SB(s)->lock);
[ 1986.897265] <Interrupt>
[ 1986.897269]     lock(&REISERFS_SB(s)->lock);
[ 1986.897276]
[ 1986.897277]  *** DEADLOCK ***
[ 1986.897278]
[ 1986.897286] no locks held by kswapd0/16.
[ 1986.897291]
[ 1986.897292] stack backtrace:
[ 1986.897299] Pid: 16, comm: kswapd0 Not tainted 3.1.1-main #8
[ 1986.897306] Call Trace:
[ 1986.897314]  [<c0439e76>] ? printk+0xf/0x11
[ 1986.897324]  [<c01482d1>] print_usage_bug+0x20e/0x21a
[ 1986.897332]  [<c01479b8>] ? print_irq_inversion_bug+0x172/0x172
[ 1986.897341]  [<c014855c>] mark_lock+0x27f/0x483
[ 1986.897349]  [<c0148d88>] __lock_acquire+0x628/0x1472
[ 1986.897358]  [<c0149fae>] lock_acquire+0x47/0x5e
[ 1986.897366]  [<c01f8bd4>] ? reiserfs_write_lock+0x20/0x2a
[ 1986.897384]  [<c01f8bd4>] ? reiserfs_write_lock+0x20/0x2a
[ 1986.897397]  [<c043b5ef>] mutex_lock_nested+0x35/0x26f
[ 1986.897409]  [<c01f8bd4>] ? reiserfs_write_lock+0x20/0x2a
[ 1986.897421]  [<c01f8bd4>] reiserfs_write_lock+0x20/0x2a
[ 1986.897433]  [<c01e2edd>] map_block_for_writepage+0xc9/0x590
[ 1986.897448]  [<c01b1706>] ? create_empty_buffers+0x33/0x8f
[ 1986.897461]  [<c0121124>] ? get_parent_ip+0xb/0x31
[ 1986.897472]  [<c043ef7f>] ? sub_preempt_count+0x81/0x8e
[ 1986.897485]  [<c043cae0>] ? _raw_spin_unlock+0x27/0x3d
[ 1986.897496]  [<c0121124>] ? get_parent_ip+0xb/0x31
[ 1986.897508]  [<c01e355d>] reiserfs_writepage+0x1b9/0x3e7
[ 1986.897521]  [<c0173b40>] ? clear_page_dirty_for_io+0xcb/0xde
[ 1986.897533]  [<c014a6e3>] ? trace_hardirqs_on_caller+0x108/0x138
[ 1986.897546]  [<c014a71e>] ? trace_hardirqs_on+0xb/0xd
[ 1986.897559]  [<c0177b38>] shrink_page_list+0x34f/0x5e2
[ 1986.897572]  [<c01780a7>] shrink_inactive_list+0x172/0x22c
[ 1986.897585]  [<c0178464>] shrink_zone+0x303/0x3b1
[ 1986.897597]  [<c043cae0>] ? _raw_spin_unlock+0x27/0x3d
[ 1986.897611]  [<c01788c9>] kswapd+0x3b7/0x5f2

The deadlock shouldn't happen since we are doing that allocation in the
mount path, the filesystem is not available for any reclaim.  Still the
warning is annoying.

To solve this, acquire the lock later only where we need it, right before
calling reiserfs_read_locked_inode() that wants to lock to walk the tree.

Reported-by: Knut Petersen <Knut_Petersen@t-online.de>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-01-10 16:30:54 -08:00
..
bitmap.c reiserfs: delay reiserfs lock until journal initialization 2012-01-10 16:30:53 -08:00
dir.c fs: push i_mutex and filemap_write_and_wait down into ->fsync() handlers 2011-07-20 20:47:59 -04:00
do_balan.c kill-the-bkl/reiserfs: move the concurrent tree accesses checks per superblock 2009-09-14 07:18:25 +02:00
file.c fs: take the ACL checks to common code 2011-07-25 14:30:23 -04:00
fix_node.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
hashes.c
ibalance.c
inode.c reiserfs: propagate umode_t 2012-01-03 22:55:00 -05:00
ioctl.c vfs: mnt_drop_write_file() 2012-01-03 22:52:40 -05:00
item_ops.c
journal.c reiserfs: don't lock journal_init() 2012-01-10 16:30:53 -08:00
Kconfig Update broken web addresses in the kernel. 2010-10-18 11:03:14 +02:00
lbalance.c reiserfs: fix warnings with gcc 4.4 2009-06-18 13:03:46 -07:00
lock.c Fix common misspellings 2011-03-31 11:26:23 -03:00
Makefile fs: change to new flag variable 2011-03-17 14:02:57 +01:00
namei.c reiserfs: propagate umode_t 2012-01-03 22:55:00 -05:00
objectid.c
prints.c reiserfs: make sure va_end() is always called after va_start(). 2011-01-13 08:03:15 -08:00
procfs.c reiserfs: don't compile procfs.o at all if no support 2009-12-16 07:20:06 -08:00
README Update broken web addresses in the kernel. 2010-10-18 11:03:14 +02:00
resize.c fs: Convert vmalloc/memset to vzalloc 2011-09-15 13:56:28 +02:00
stree.c dquot: cleanup space allocation / freeing routines 2010-03-05 00:20:28 +01:00
super.c reiserfs: don't lock root inode searching 2012-01-10 16:30:54 -08:00
tail_conversion.c
xattr.c switch vfs_mkdir() and ->mkdir() to umode_t 2012-01-03 22:54:53 -05:00
xattr_acl.c switch posix_acl_equiv_mode() to umode_t * 2011-08-01 02:10:06 -04:00
xattr_security.c security: new security_inode_init_security API adds function callback 2011-07-18 12:29:38 -04:00
xattr_trusted.c reiserfs: constify xattr_handler 2010-05-21 18:31:19 -04:00
xattr_user.c reiserfs: constify xattr_handler 2010-05-21 18:31:19 -04:00

[LICENSING]

ReiserFS is hereby licensed under the GNU General
Public License version 2.

Source code files that contain the phrase "licensing governed by
reiserfs/README" are "governed files" throughout this file.  Governed
files are licensed under the GPL.  The portions of them owned by Hans
Reiser, or authorized to be licensed by him, have been in the past,
and likely will be in the future, licensed to other parties under
other licenses.  If you add your code to governed files, and don't
want it to be owned by Hans Reiser, put your copyright label on that
code so the poor blight and his customers can keep things straight.
All portions of governed files not labeled otherwise are owned by Hans
Reiser, and by adding your code to it, widely distributing it to
others or sending us a patch, and leaving the sentence in stating that
licensing is governed by the statement in this file, you accept this.
It will be a kindness if you identify whether Hans Reiser is allowed
to license code labeled as owned by you on your behalf other than
under the GPL, because he wants to know if it is okay to do so and put
a check in the mail to you (for non-trivial improvements) when he
makes his next sale.  He makes no guarantees as to the amount if any,
though he feels motivated to motivate contributors, and you can surely
discuss this with him before or after contributing.  You have the
right to decline to allow him to license your code contribution other
than under the GPL.

Further licensing options are available for commercial and/or other
interests directly from Hans Reiser: hans@reiser.to.  If you interpret
the GPL as not allowing those additional licensing options, you read
it wrongly, and Richard Stallman agrees with me, when carefully read
you can see that those restrictions on additional terms do not apply
to the owner of the copyright, and my interpretation of this shall
govern for this license.

Finally, nothing in this license shall be interpreted to allow you to
fail to fairly credit me, or to remove my credits, without my
permission, unless you are an end user not redistributing to others.
If you have doubts about how to properly do that, or about what is
fair, ask.  (Last I spoke with him Richard was contemplating how best
to address the fair crediting issue in the next GPL version.)

[END LICENSING]

Reiserfs is a file system based on balanced tree algorithms, which is
described at https://reiser4.wiki.kernel.org/index.php/Main_Page 

Stop reading here.  Go there, then return.

Send bug reports to yura@namesys.botik.ru.

mkreiserfs and other utilities are in reiserfs/utils, or wherever your
Linux provider put them.  There is some disagreement about how useful
it is for users to get their fsck and mkreiserfs out of sync with the
version of reiserfs that is in their kernel, with many important
distributors wanting them out of sync.:-) Please try to remember to
recompile and reinstall fsck and mkreiserfs with every update of
reiserfs, this is a common source of confusion.  Note that some of the
utilities cannot be compiled without accessing the balancing code
which is in the kernel code, and relocating the utilities may require
you to specify where that code can be found.

Yes, if you update your reiserfs kernel module you do have to
recompile your kernel, most of the time.  The errors you get will be
quite cryptic if your forget to do so.

Real users, as opposed to folks who want to hack and then understand
what went wrong, will want REISERFS_CHECK off.

Hideous Commercial Pitch: Spread your development costs across other OS
vendors.  Select from the best in the world, not the best in your
building, by buying from third party OS component suppliers.  Leverage
the software component development power of the internet.  Be the most
aggressive in taking advantage of the commercial possibilities of
decentralized internet development, and add value through your branded
integration that you sell as an operating system.  Let your competitors
be the ones to compete against the entire internet by themselves.  Be
hip, get with the new economic trend, before your competitors do.  Send
email to hans@reiser.to.

To understand the code, after reading the website, start reading the
code by reading reiserfs_fs.h first.

Hans Reiser was the project initiator, primary architect, source of all
funding for the first 5.5 years, and one of the programmers.  He owns
the copyright.

Vladimir Saveljev was one of the programmers, and he worked long hours
writing the cleanest code.  He always made the effort to be the best he
could be, and to make his code the best that it could be.  What resulted
was quite remarkable. I don't think that money can ever motivate someone
to work the way he did, he is one of the most selfless men I know.

Yura helps with benchmarking, coding hashes, and block pre-allocation
code.

Anatoly Pinchuk is a former member of our team who worked closely with
Vladimir throughout the project's development.  He wrote a quite
substantial portion of the total code.  He realized that there was a
space problem with packing tails of files for files larger than a node
that start on a node aligned boundary (there are reasons to want to node
align files), and he invented and implemented indirect items and
unformatted nodes as the solution.

Konstantin Shvachko, with the help of the Russian version of a VC,
tried to put me in a position where I was forced into giving control
of the project to him.  (Fortunately, as the person paying the money
for all salaries from my dayjob I owned all copyrights, and you can't
really force takeovers of sole proprietorships.)  This was something
curious, because he never really understood the value of our project,
why we should do what we do, or why innovation was possible in
general, but he was sure that he ought to be controlling it.  Every
innovation had to be forced past him while he was with us.  He added
two years to the time required to complete reiserfs, and was a net
loss for me.  Mikhail Gilula was a brilliant innovator who also left
in a destructive way that erased the value of his contributions, and
that he was shown much generosity just makes it more painful.

Grigory Zaigralin was an extremely effective system administrator for
our group.

Igor Krasheninnikov was wonderful at hardware procurement, repair, and
network installation.

Jeremy Fitzhardinge wrote the teahash.c code, and he gives credit to a
textbook he got the algorithm from in the code.  Note that his analysis
of how we could use the hashing code in making 32 bit NFS cookies work
was probably more important than the actual algorithm.  Colin Plumb also
contributed to it.

Chris Mason dived right into our code, and in just a few months produced
the journaling code that dramatically increased the value of ReiserFS.
He is just an amazing programmer.

Igor Zagorovsky is writing much of the new item handler and extent code
for our next major release.

Alexander Zarochentcev (sometimes known as zam, or sasha), wrote the
resizer, and is hard at work on implementing allocate on flush.  SGI
implemented allocate on flush before us for XFS, and generously took
the time to convince me we should do it also.  They are great people,
and a great company.

Yuri Shevchuk and Nikita Danilov are doing squid cache optimization.

Vitaly Fertman is doing fsck.

Jeff Mahoney, of SuSE, contributed a few cleanup fixes, most notably
the endian safe patches which allow ReiserFS to run on any platform
supported by the Linux kernel.

SuSE, IntegratedLinux.com, Ecila, MP3.com, bigstorage.com, and the
Alpha PC Company made it possible for me to not have a day job
anymore, and to dramatically increase our staffing.  Ecila funded
hypertext feature development, MP3.com funded journaling, SuSE funded
core development, IntegratedLinux.com funded squid web cache
appliances, bigstorage.com funded HSM, and the alpha PC company funded
the alpha port.  Many of these tasks were helped by sponsors other
than the ones just named.  SuSE has helped in much more than just
funding....