Under some cases, when scrubbing the PEB if we did not get the lock on
the PEB it fails to scrub. Add that PEB again to the scrub list
Artem: minor amendments.
Cc: stable@kernel.org [2.6.31+]
Signed-off-by: Bhavesh Parekh <bparekh@nvidia.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Currently UBI erases all corrupted eraseblocks, irrespectively of the nature
of corruption: corruption due to power cuts and non-power cut corruption.
The former case is OK, but the latter is not, because UBI may destroy
potentially important data.
With this patch, during scanning, when UBI hits a PEB with corrupted VID
header, it checks whether this PEB contains only 0xFF data. If yes, it is
safe to erase this PEB and it is put to the 'erase' list. If not, this may
be important data and it is better to avoid erasing this PEB. Instead,
UBI puts it to the corr list and moves out of the pool of available PEB.
IOW, UBI preserves this PEB.
Such corrupted PEB lessen the amount of available PEBs. So the more of them
we accumulate, the less PEBs are available. The maximum amount of non-power
cut corrupted PEBs is 8.
This patch is a response to UBIFS problem where reporter
(Matthew L. Creech <mlcreech@gmail.com>) observes that UBIFS index points
to an unmapped LEB. The theory is that corresponding PEB somehow got
corrupted and UBI wiped it. This patch (actually a series of patches)
tries to make sure such PEBs are preserved - this would make it is easier
to analyze the corruption.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Rename UBI_IO_BAD_HDR_READ into UBI_IO_BAD_HDR_EBADMSG which is presumably more
self-documenting and readable. Indeed, the '_READ' suffix does not tell much and
even confuses, while '_EBADMSG' tells about uncorrectable ECC error, because we
use -EBADMSG all over the place to represent ECC errors.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Currently, when UBI attaches an MTD device and cannot reserve all 1% (by
default) of PEBs for bad eraseblocks handling, it prints a warning. However,
Matthew L. Creech <mlcreech@gmail.com> is not very happy to see this warning,
because he did reserve enough of PEB at the beginning, but with time some
PEBs became bad. The warning is not necessary in this case.
This patch makes UBI print the warning
o if this is a new image
o of this is used image and the amount of reserved PEBs is only 10% (or less)
of the size of the reserved PEB pool.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
This patch introduces the %UBI_IO_BAD_HDR_READ return code for
the I/O level function. We will use this code in order to distinguish
between "corrupted header possibly because this is non-ubi data" and
"corrupted header possibly because of real data corruption and ECC error".
So far this patch does not introduce any functional change, just a
preparation.
This patch is pased on a patch from
Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Reviewed-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Tested-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
We do not really need 2 separate error codes for indicating bad VID
and bad EC headers (UBI_IO_BAD_EC_HDR, UBI_IO_BAD_VID_HDR), it is
enough to have only one UBI_IO_BAD_HDR return code.
This patch does not introduce any functional change, only some
code simplification.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Reviewed-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Tested-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
If we fail in 'ubi_eba_init_scan()', we free
'ubi->volumes[i]->eba_tbl' in there, but also later free it
in 'free_internal_volumes()'. Fix this by assigning NULL
to 'ubi->volumes[i]->eba_tbl' after it is freed.
Signed-off-by: Adrian Hunter <adrian.hunter@nokia.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
The UBIFS WL worker may encounter read errors and there is logic
which makes a decision whether we should do one of:
1. cancel the operation and move the PEB with the read errors to
the 'erroneous' list;
2. switch to R/O mode.
ATM, only -EIO errors trigger 1., other errors trigger 2. The idea
is that if we know we encountered an I/O error, do 1. Otherwise,
we do not know how to react, and do 2., just in case. E.g., if
the underlying driver became crazy because of a bug, we do not
want to harm any data, and switch to R/O mode.
This patch does 2 things:
1. Makes sure reads from the source PEB always cause 1. This is
more consistent with other reads which come from the upper
layers and never cause R/O.
2. Teaches UBI to do 1. also on -EBADMSG, UBI_IO_BAD_VID_HDR,
-ENOMEM, and -ETIMEOUT. But this is only when reading the
target PEB.
This preblems were hunted by Adrian Hunter.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
This patch improves UBI errors handling. ATM UBI switches to
R/O mode when the WL worker fails to read the source PEB.
This means that the upper layers (e.g., UBIFS) has no
chances to unmap the erroneous PEB and fix the error.
This patch changes this behaviour and makes UBI put PEBs
like this into a separate RB-tree, thus preventing the
WL worker from hitting the same read errors again and
again.
But there is a 10% limit on a maximum amount of PEBs like this.
If there are too much of them, UBI switches to R/O mode.
Additionally, this patch teaches UBI not to panic and
switch to R/O mode if after a PEB has been copied, the
target LEB cannot be read back. Instead, now UBI cancels
the operation and schedules the target PEB for torturing.
The error paths has been tested by ingecting errors
into 'ubi_eba_copy_leb()'.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
This patch fixes the error path in the WL worker - in same cases
UBI oopses when 'goto out_error' happens and e1 or e2 are NULL.
This patch also cleans up the error paths a little. And I have
tested nearly all error paths in the WL worker.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
This patch is a clean-up and a preparation for the following
patches. It introduece constants for the return values of the
'ubi_eba_copy_leb()' function.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
When a PEB is moved and a write error happens, UBI switches
to R/O mode, which is wrong, because we just copy the data
and may select a different PEB and re-try this. This patch
fixes WL worker's behavior.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
We cannot call 'ubi_wl_get_peb()' with @ubi->buf_mutex locked,
because 'ubi_wl_get_peb()' may force erasure, which, in turn,
may call 'torture_peb()' which also locks the @ubi->buf_mutex
and deadlocks.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
'ubi_io_read_data()' may return EBADMSG in case of an ECC error,
and we should not panic because of this. We have CRC32 checksum
and may check the data. So just ignore the EBADMSG error.
This patch also fixes a minor spelling error at the same time.
Signed-off-by: Zoltan Sogor <weth@inf.u-szeged.hu>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Just out or curiousity ran checkpatch.pl for whole UBI,
and discovered there are quite a few of stylistic issues.
Fix them.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
This is probably a copy-paste bug - we torture the old PEB
in the atomic LEB change function, but we should not do this.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Hch asked not to use "unit" for sub-systems, let it be so.
Also some other commentaries modifications.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
leb_read_unlock() may be called simultaniously by several tasks.
The would race at the following code:
up_read(&le->mutex);
if (free)
kfree(le);
And it is possible that one task frees 'le' before the other tasks
do 'up_read()'. Fix this by doing up_read and free inside the
'ubi->ltree' lock. Below it the oops we had because of this:
BUG: spinlock bad magic on CPU#0, integck/7504
BUG: unable to handle kernel paging request at 6b6b6c4f
IP: [<c0211221>] spin_bug+0x5c/0xdb
*pde = 00000000 Oops: 0000 [#1] PREEMPT SMP Modules linked in: ubifs ubi nandsim nand nand_ids nand_ecc video output
Pid: 7504, comm: integck Not tainted (2.6.26-rc3ubifs26 #8)
EIP: 0060:[<c0211221>] EFLAGS: 00010002 CPU: 0
EIP is at spin_bug+0x5c/0xdb
EAX: 00000032 EBX: 6b6b6b6b ECX: 6b6b6b6b EDX: f7f7ce30
ESI: f76491dc EDI: c044f51f EBP: e8a736cc ESP: e8a736a8
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process integck (pid: 7504, ti=e8a72000 task=f7f7ce30 task.ti=e8a72000)
Stack: c044f754 c044f51f 00000000 f7f7d024 00001d50 00000001 f76491dc 00000296 f6df50e0 e8a736d8 c02112f0 f76491dc e8a736e8 c039157a f7d9e830 f76491d8 e8a7370c c020b975 f76491dc 00000296 f76491f8 00000000 f76491d8 00000000 Call Trace:
[<c02112f0>] ? _raw_spin_unlock+0x50/0x7c
[<c039157a>] ? _spin_unlock_irqrestore+0x20/0x58
[<c020b975>] ? rwsem_wake+0x4b/0x122
[<c0390e0a>] ? call_rwsem_wake+0xa/0xc
[<c0139ee7>] ? up_read+0x28/0x31
[<f8873b3c>] ? leb_read_unlock+0x73/0x7b [ubi]
[<f88742a3>] ? ubi_eba_read_leb+0x195/0x2b0 [ubi]
[<f8872a04>] ? ubi_leb_read+0xaf/0xf8 [ubi]
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
ubi_free_volume() function sets ubi->volumes[] to NULL, so
ubi_eba_close() is useless, it does not free what has to be freed.
So zap it and free vol->eba_tbl at the volume release function.
Pointed-out-by: Adrian Hunter <ext-adrian.hunter@nokia.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
UBI already checks that @min io size is the power of 2 at io_init.
It is save to use bit operations then.
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Add more information about layout volume to make userspace tools
use the macros instead of constants. Also rename UBI_LAYOUT_VOL_ID
to make it consistent with other macros.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
The problem: NAND flashes have different amount of initial bad physical
eraseblocks (marked as bad by the manufacturer). For example, for 256MiB
Samsung OneNAND flash there might be from 0 to 40 bad initial eraseblocks,
which is about 2%. When UBI is used as the base system, one needs to know
the exact amount of good physical eraseblocks, because this number is
needed to create the UBI image which is put to the devices during
production. But this number is not know, which forces us to use the
minimum number of good physical eraseblocks. And UBI additionally
reserves some percentage of physical eraseblocks for bad block handling
(default is 1%), so we have 1-3% of PEBs reserved at the end, depending
on the amount of initial bad PEBs. But it is desired to always have
1% (or more, depending on the configuration).
Solution: this patch adds an "auto-resize" flag to the volume table.
The volume which has the "auto-resize" flag will automatically be re-sized
(enlarged) on the first UBI initialization. UBI clears the flag when
the volume is re-sized. Only one volume may have the "auto-resize" flag.
So, the production UBI image may have one volume with "auto-resize"
flag set, and its size is automatically adjusted on the first boot
of the device.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
This slab cache is not really needed since the number of objects
is low and the constructor does not make much sense because we
allocate oblects when doint I/O, which is way slower then allocation.
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
This is one more step on the way to "removable" UBI devices. It
adds reference counting for UBI devices. Every time a volume on
this device is opened - the device's refcount is increased. It
is also increased if someone is reading any sysfs file of this
UBI device or of one of its volumes.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
When the WL worker is moving an LEB, the volume might go away
occasionally. UBI does not handle these situations correctly.
This patch introduces a new mutex which serializes wear-levelling
worker and the the 'ubi_wl_put_peb()' function. Now, if one puts
an LEB, and its PEB is being moved, it will wait on the mutex.
And because we unmap all LEBs when removing volumes, this will make
the volume remove function to wait while the LEB movement
finishes.
Below is an example of an oops which should be fixed by this patch:
Pid: 9167, comm: io_paral Not tainted (2.6.24-rc5-ubi-2.6.git #2)
EIP: 0060:[<f884a379>] EFLAGS: 00010246 CPU: 0
EIP is at prot_tree_del+0x2a/0x63 [ubi]
EAX: f39a90e0 EBX: 00000000 ECX: 00000000 EDX: 00000134
ESI: f39a90e0 EDI: f39a90e0 EBP: f2d55ddc ESP: f2d55dd4
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process io_paral (pid: 9167, ti=f2d54000 task=f72a8030 task.ti=f2d54000)
Stack: f39a95f8 ef6aae50 f2d55e08 f884a511 f88538e1 f884ecea 00000134 00000000
f39a9604 f39a95f0 efea8280 00000000 f39a90e0 f2d55e40 f8847261 f8850c3c
f884eaad 00000001 000000b9 00000134 00000172 000000b9 00000134 00000001
Call Trace:
[<c0105227>] show_trace_log_lvl+0x1a/0x30
[<c01052e2>] show_stack_log_lvl+0xa5/0xca
[<c01053d6>] show_registers+0xcf/0x21b
[<c0105648>] die+0x126/0x224
[<c0119a62>] do_page_fault+0x27f/0x60d
[<c037dd62>] error_code+0x72/0x78
[<f884a511>] ubi_wl_put_peb+0xf0/0x191 [ubi]
[<f8847261>] ubi_eba_unmap_leb+0xaf/0xcc [ubi]
[<f8843c21>] ubi_remove_volume+0x102/0x1e8 [ubi]
[<f8846077>] ubi_cdev_ioctl+0x22a/0x383 [ubi]
[<c017d768>] do_ioctl+0x68/0x71
[<c017d7c6>] vfs_ioctl+0x55/0x271
[<c017da15>] sys_ioctl+0x33/0x52
[<c0104152>] sysenter_past_esp+0x5f/0xa5
=======================
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Add ref_count field to UBI volumes and remove weired "vol->removed"
field. This way things are better understandable and we do not have
to do whold show_attr operation under spinlock.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Pass volume description object to the EBA function which makes
more sense, and EBA function do not have to find the volume
description object by volume ID.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Since the ltree_entry slab cache is a global entity, which is
used by all UBI devices, it is more logical to create it on
module initialization time and destro on module exit time.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
The idea of this interface belongs to Adrian Hunter. The
interface is extremely useful when one has to have a guarantee
that an LEB will contain all 0xFFs even in case of an unclean
reboot. UBI does have an 'ubi_leb_erase()' call which may do
this, but it is stupid and ineffecient, because it flushes whole
queue. I should be re-worked to just be a pair of unmap,
map calls.
The user of the interfaci is UBIFS at the moment.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
First allocate the necessary eraseblocks, then the optional ones.
Otherwise it allocates all PEBs for bad EB handling, and fails
on then following EBA LEB allocation.
Reported-by: Adrian Hunter <ext-adrian.hunter@nokia.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Slab constructors currently have a flags parameter that is never used. And
the order of the arguments is opposite to other slab functions. The object
pointer is placed before the kmem_cache pointer.
Convert
ctor(void *object, struct kmem_cache *s, unsigned long flags)
to
ctor(struct kmem_cache *s, void *object)
throughout the kernel
[akpm@linux-foundation.org: coupla fixes]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix the following warning:
drivers/mtd/ubi/eba.c: In function 'ubi_eba_init_scan':
drivers/mtd/ubi/eba.c:1116: warning: 'err' may be used uninitialized in this function
Pointed-to-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
When the UBI device is nearly full, i.e. all LEBs are mapped, we have
only one spare LEB left - the one we reserved for WL purposes. Well,
I do not count the LEBs which were reserved for bad PEB handling -
suppose NOR flash for simplicity. If an "atomic LEB change operation"
is run, and the WL unit is moving a LEB, we have no spare LEBs to
finish the operation and fail, which is not good. Moreover, if there
are 2 or more simultanious "atomic LEB change" requests, only one of
them has chances to succeed, the other will fail with -ENOSPC. Not
good either.
This patch does 2 things:
1. Reserves one PEB for the "atomic LEB change" operation.
2. Serealize the operations so that only on of them may run
at a time (by means of a mutex).
Pointed-to-by: Brijesh Singh <brijesh.s.singh@gmail.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Similar reason as in case of the previous patch: it causes
deadlocks if a filesystem with writeback support works on top
of UBI. So pre-allocate needed buffers when attaching MTD device.
We also need mutexes to protect the buffers, but they do not
cause much contantion because they are used in recovery, torture,
and WL copy routines, which are called seldom.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Use GFP_NOFS flag when allocating memory on I/O path, because otherwise
we may deadlock the filesystem which works on top of us. We observed
the deadlocks with UBIFS. Example:
VFS->FS lock a lock->UBI->kmalloc()->VFS writeback->FS locks the same
lock again.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Slab destructors were no longer supported after Christoph's
c59def9f22 change. They've been
BUGs for both slab and slub, and slob never supported them
either.
This rips out support for the dtor pointer from kmem_cache_create()
completely and fixes up every single callsite in the kernel (there were
about 224, not including the slab allocator definitions themselves,
or the documentation references).
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
atomic_leb_change() is only allowed for dynamic volumes, so set
the volume type correctly.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Do not call 'ubi_wl_put_peb()' if the LEB was unmapped.
Reported-by: Gabor Loki <loki@inf.u-szeged.hu>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Kill UBI's homegrown endianess handling and replace it with
the standard kernel endianess handling.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
UBI allocates temporary buffers of PEB size, which may be 256KiB.
Use vmalloc instead of kmalloc for such big temporary buffers.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Mark variables in drivers/* with uninitialized_var() if such a warning
appears, and analysis proves that the var is initialized properly on all
paths it is used.
Signed-off-by: Jeff Garzik <jeff@garzik.org>
SLAB_CTOR_CONSTRUCTOR is always specified. No point in checking it.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Steven French <sfrench@us.ibm.com>
Cc: Michael Halcrow <mhalcrow@us.ibm.com>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Dave Kleikamp <shaggy@austin.ibm.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Anton Altaparmakov <aia21@cantab.net>
Cc: Mark Fasheh <mark.fasheh@oracle.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jan Kara <jack@ucw.cz>
Cc: David Chinner <dgc@sgi.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I have never seen a use of SLAB_DEBUG_INITIAL. It is only supported by
SLAB.
I think its purpose was to have a callback after an object has been freed
to verify that the state is the constructor state again? The callback is
performed before each freeing of an object.
I would think that it is much easier to check the object state manually
before the free. That also places the check near the code object
manipulation of the object.
Also the SLAB_DEBUG_INITIAL callback is only performed if the kernel was
compiled with SLAB debugging on. If there would be code in a constructor
handling SLAB_DEBUG_INITIAL then it would have to be conditional on
SLAB_DEBUG otherwise it would just be dead code. But there is no such code
in the kernel. I think SLUB_DEBUG_INITIAL is too problematic to make real
use of, difficult to understand and there are easier ways to accomplish the
same effect (i.e. add debug code before kfree).
There is a related flag SLAB_CTOR_VERIFY that is frequently checked to be
clear in fs inode caches. Remove the pointless checks (they would even be
pointless without removeal of SLAB_DEBUG_INITIAL) from the fs constructors.
This is the last slab flag that SLUB did not support. Remove the check for
unimplemented flags from SLUB.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>