linux/Documentation
David Howells 9f10523f89 FS-Cache: Fix operation state management and accounting
Fix the state management of internal fscache operations and the accounting of
what operations are in what states.

This is done by:

 (1) Give struct fscache_operation a enum variable that directly represents the
     state it's currently in, rather than spreading this knowledge over a bunch
     of flags, who's processing the operation at the moment and whether it is
     queued or not.

     This makes it easier to write assertions to check the state at various
     points and to prevent invalid state transitions.

 (2) Add an 'operation complete' state and supply a function to indicate the
     completion of an operation (fscache_op_complete()) and make things call
     it.  The final call to fscache_put_operation() can then check that an op
     in the appropriate state (complete or cancelled).

 (3) Adjust the use of object->n_ops, ->n_in_progress, ->n_exclusive to better
     govern the state of an object:

	(a) The ->n_ops is now the number of extant operations on the object
	    and is now decremented by fscache_put_operation() only.

	(b) The ->n_in_progress is simply the number of objects that have been
	    taken off of the object's pending queue for the purposes of being
	    run.  This is decremented by fscache_op_complete() only.

	(c) The ->n_exclusive is the number of exclusive ops that have been
	    submitted and queued or are in progress.  It is decremented by
	    fscache_op_complete() and by fscache_cancel_op().

     fscache_put_operation() and fscache_operation_gc() now no longer try to
     clean up ->n_exclusive and ->n_in_progress.  That was leading to double
     decrements against fscache_cancel_op().

     fscache_cancel_op() now no longer decrements ->n_ops.  That was leading to
     double decrements against fscache_put_operation().

     fscache_submit_exclusive_op() now decides whether it has to queue an op
     based on ->n_in_progress being > 0 rather than ->n_ops > 0 as the latter
     will persist in being true even after all preceding operations have been
     cancelled or completed.  Furthermore, if an object is active and there are
     runnable ops against it, there must be at least one op running.

 (4) Add a remaining-pages counter (n_pages) to struct fscache_retrieval and
     provide a function to record completion of the pages as they complete.

     When n_pages reaches 0, the operation is deemed to be complete and
     fscache_op_complete() is called.

     Add calls to fscache_retrieval_complete() anywhere we've finished with a
     page we've been given to read or allocate for.  This includes places where
     we just return pages to the netfs for reading from the server and where
     accessing the cache fails and we discard the proposed netfs page.

The bugs in the unfixed state management manifest themselves as oopses like the
following where the operation completion gets out of sync with return of the
cookie by the netfs.  This is possible because the cache unlocks and returns
all the netfs pages before recording its completion - which means that there's
nothing to stop the netfs discarding them and returning the cookie.


FS-Cache: Cookie 'NFS.fh' still has outstanding reads
------------[ cut here ]------------
kernel BUG at fs/fscache/cookie.c:519!
invalid opcode: 0000 [#1] SMP
CPU 1
Modules linked in: cachefiles nfs fscache auth_rpcgss nfs_acl lockd sunrpc

Pid: 400, comm: kswapd0 Not tainted 3.1.0-rc7-fsdevel+ #1090                  /DG965RY
RIP: 0010:[<ffffffffa007050a>]  [<ffffffffa007050a>] __fscache_relinquish_cookie+0x170/0x343 [fscache]
RSP: 0018:ffff8800368cfb00  EFLAGS: 00010282
RAX: 000000000000003c RBX: ffff880023cc8790 RCX: 0000000000000000
RDX: 0000000000002f2e RSI: 0000000000000001 RDI: ffffffff813ab86c
RBP: ffff8800368cfb50 R08: 0000000000000002 R09: 0000000000000000
R10: ffff88003a1b7890 R11: ffff88001df6e488 R12: ffff880023d8ed98
R13: ffff880023cc8798 R14: 0000000000000004 R15: ffff88003b8bf370
FS:  0000000000000000(0000) GS:ffff88003bd00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000008ba008 CR3: 0000000023d93000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process kswapd0 (pid: 400, threadinfo ffff8800368ce000, task ffff88003b8bf040)
Stack:
 ffff88003b8bf040 ffff88001df6e528 ffff88001df6e528 ffffffffa00b46b0
 ffff88003b8bf040 ffff88001df6e488 ffff88001df6e620 ffffffffa00b46b0
 ffff88001ebd04c8 0000000000000004 ffff8800368cfb70 ffffffffa00b2c91
Call Trace:
 [<ffffffffa00b2c91>] nfs_fscache_release_inode_cookie+0x3b/0x47 [nfs]
 [<ffffffffa008f25f>] nfs_clear_inode+0x3c/0x41 [nfs]
 [<ffffffffa0090df1>] nfs4_evict_inode+0x2f/0x33 [nfs]
 [<ffffffff810d8d47>] evict+0xa1/0x15c
 [<ffffffff810d8e2e>] dispose_list+0x2c/0x38
 [<ffffffff810d9ebd>] prune_icache_sb+0x28c/0x29b
 [<ffffffff810c56b7>] prune_super+0xd5/0x140
 [<ffffffff8109b615>] shrink_slab+0x102/0x1ab
 [<ffffffff8109d690>] balance_pgdat+0x2f2/0x595
 [<ffffffff8103e009>] ? process_timeout+0xb/0xb
 [<ffffffff8109dba3>] kswapd+0x270/0x289
 [<ffffffff8104c5ea>] ? __init_waitqueue_head+0x46/0x46
 [<ffffffff8109d933>] ? balance_pgdat+0x595/0x595
 [<ffffffff8104bf7a>] kthread+0x7f/0x87
 [<ffffffff813ad6b4>] kernel_thread_helper+0x4/0x10
 [<ffffffff81026b98>] ? finish_task_switch+0x45/0xc0
 [<ffffffff813abcdd>] ? retint_restore_args+0xe/0xe
 [<ffffffff8104befb>] ? __init_kthread_worker+0x53/0x53
 [<ffffffff813ad6b0>] ? gs_change+0xb/0xb

Signed-off-by: David Howells <dhowells@redhat.com>
2012-12-20 21:58:26 +00:00
..
ABI Nothing all that exciting; a new module-from-fd syscall for those who want 2012-12-19 07:55:08 -08:00
DocBook kstrto*: add documentation 2012-12-17 17:15:22 -08:00
EDID
PCI PCI: SRIOV control and status via sysfs (documentation) 2012-11-28 11:25:47 -07:00
RCU Merge branches 'urgent.2012.10.27a', 'doc.2012.11.16a', 'fixes.2012.11.13a', 'srcu.2012.10.27a', 'stall.2012.11.13a', 'tracing.2012.11.08a' and 'idle.2012.10.24a' into HEAD 2012-11-16 09:59:58 -08:00
accounting doc: Remove unnecessary declarations from Documentation/accounting/getdelays.c 2012-11-26 14:22:21 +01:00
acpi Merge branch 'x86-acpi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-12-14 10:03:23 -08:00
aoe aoe: allow user to disable target failure timeout 2012-12-17 17:15:25 -08:00
arm fbdev changes for 3.8: 2012-12-15 13:03:48 -08:00
arm64 Documentation: Fixes a word in Documentation/arm64/memory.txt 2012-11-29 16:33:18 +00:00
auxdisplay
backlight drivers/video/backlight/lp855x_bl.c: use generic PWM functions 2012-12-17 17:15:16 -08:00
blackfin Documentation: Fix typo in multiple files in Documentation 2012-04-16 14:37:13 +02:00
block block: Kill bi_destructor 2012-09-09 10:35:39 +02:00
blockdev
bus-devices ARM: OMAP2+: gpmc: generic timing calculation 2012-11-09 18:07:11 +05:30
cdrom
cgroups kmem: add slab-specific documentation about the kmem controller 2012-12-18 15:02:15 -08:00
connector connector: Move cn_test.c away from NLMSG_PUT(). 2012-06-26 21:19:02 -07:00
console
cpu-freq acpi-cpufreq: Add support for disabling dynamic overclocking 2012-09-09 22:05:12 +02:00
cpuidle Honor state disabling in the cpuidle ladder governor 2012-09-04 01:35:44 +02:00
cris
crypto KEYS: Document asymmetric key type 2012-10-08 13:50:12 +10:30
development-process
device-mapper DM RAID: Add rebuild capability for RAID10 2012-10-11 13:40:24 +11:00
devicetree Device tree v3.8 bug fix branch. Fixes an undefined struct device build 2012-12-19 20:26:16 -08:00
driver-model pwm: add devm_pwm_get() and devm_pwm_put() 2012-09-10 17:05:45 +02:00
dvb get_dvb_firmware: fix download site for tda10046 firmware 2012-09-28 16:16:00 -03:00
early-userspace
extcon Documentation/extcon: porting guide for Android kernel switch driver. 2012-04-20 09:24:27 -07:00
fault-injection doc: fix quite a few typos within Documentation 2012-11-19 14:28:24 +01:00
fb
filesystems FS-Cache: Fix operation state management and accounting 2012-12-20 21:58:26 +00:00
firmware_class firmware loader: document firmware cache mechanism 2012-11-14 15:07:18 -08:00
frv
hid doc: fix quite a few typos within Documentation 2012-11-19 14:28:24 +01:00
hwmon hwmon: (it87) Report thermal sensor type as Intel PECI if appropriate 2012-12-19 22:17:02 +01:00
i2c i2c: Mention functionality flags in SMBus protocol documentation 2012-12-16 21:11:55 +01:00
i2o
ia64 doc: aliasing-test: close fd on write error 2012-09-01 09:57:10 -07:00
ide
infiniband IB/ipoib: Add rtnl_link_ops support 2012-09-20 16:49:17 -04:00
input Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid 2012-12-13 12:00:48 -08:00
ioctl ioctl-number.txt: Remove legacy private ioctl's from media drivers 2012-08-14 00:07:39 -03:00
isdn
ja_JP
kbuild Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2012-12-13 12:00:02 -08:00
kdump kexec: update URL of kexec homepage 2012-07-18 18:35:57 -07:00
ko_KR
laptops Documentation: fix the VM knobs descritpion WRT pdflush 2012-08-04 12:15:09 +04:00
leds leds-lp5523: add channel name in the platform data 2012-09-11 18:32:41 +08:00
m68k
make
memory-devices memory: emif: add basic infrastructure for EMIF driver 2012-05-02 00:10:49 -07:00
mips
misc-devices doc: fix quite a few typos within Documentation 2012-11-19 14:28:24 +01:00
mmc mmc: core: Extend sysfs to ext_csd parameters for RPMB support 2012-12-06 13:54:48 -05:00
mn10300
mtd
namespaces
netlabel
networking doc: Tighten-up and clarify description of tcp_fin_timeout 2012-12-10 17:14:28 -05:00
nfc NFC: Error management documentation 2012-07-09 16:42:11 -04:00
parisc Documentation: Fix typo in multiple files in Documentation 2012-04-16 14:37:13 +02:00
pcmcia
power Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux 2012-12-11 22:15:57 -08:00
powerpc powerpc/hw-breakpoint: Use generic hw-breakpoint interfaces for new PPC ptrace flags 2012-11-15 13:00:23 +11:00
pps
prctl seccomp: Make syscall skipping and nr changes more consistent 2012-10-02 21:14:29 +10:00
pti
ptp
rapidio
s390
scheduler sched: Remove __ARCH_WANT_INTERRUPTS_ON_CTXSW 2012-09-13 16:52:04 +02:00
scsi [SCSI] hptiop: Support HighPoint RR4520/RR4522 HBA 2012-11-27 08:59:43 +04:00
security Documentation: fix Documentation/security/00-INDEX 2012-12-17 17:15:22 -08:00
serial firmware: remove computone driver firmware and documentation 2012-08-16 12:31:18 -07:00
sh
sound ALSA: usb-audio: Deprecate async_unlink option 2012-11-21 11:37:40 +01:00
spi spi: Updates for v3.7 2012-10-02 17:26:42 -07:00
sysctl coredump: add support for %d=__get_dumpable() in core name 2012-10-06 03:05:15 +09:00
target target: Simplify fabric sense data length handling 2012-09-17 17:12:58 -07:00
thermal Thermal: Add documentation for platform layer data 2012-11-05 14:00:09 +08:00
timers
trace doc: fix old config name of kprobetrace 2012-09-27 12:11:29 +02:00
usb USB: report submission of active URBs 2012-11-11 18:10:46 -08:00
vDSO
video4linux doc: fix quite a few typos within Documentation 2012-11-19 14:28:24 +01:00
virtual KVM: PPC: booke: Get/set guest EPCR register using ONE_REG interface 2012-12-06 01:34:20 +01:00
vm Merge branch 'akpm' (Andrew's patch-bomb) 2012-12-13 13:11:15 -08:00
w1 1-Wire: Add support for the maxim ds1825 temperature sensor 2012-08-16 12:33:59 -07:00
watchdog watchdog: fix watchdog-test.c build warning 2012-08-29 17:12:58 +02:00
wimax
x86 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-12-19 12:56:42 -08:00
xtensa xtensa: initialize atomctl SR 2012-12-18 21:10:22 -08:00
zh_CN Documentation:Update Documentation/zh_CN/arm64/memory.txt 2012-11-26 16:25:37 -08:00
.gitignore
00-INDEX Documentation: remove reference to feature-removal-schedule.txt 2012-12-17 17:15:12 -08:00
BUG-HUNTING
Changes
CodingStyle CodingStyle: add networking specific block comment style 2012-10-06 03:04:59 +09:00
DMA-API-HOWTO.txt
DMA-API.txt
DMA-ISA-LPC.txt
DMA-attributes.txt common: DMA-mapping: add DMA_ATTR_FORCE_CONTIGUOUS attribute 2012-11-29 03:30:34 -08:00
HOWTO HOWTO: fix double words typo 2012-12-10 15:54:27 +01:00
IPMI.txt IPMI: Remove SMBus driver info from the docs 2012-10-16 18:07:12 -07:00
IRQ-affinity.txt
IRQ-domain.txt irqdomain: update documentation 2012-12-05 23:52:10 +00:00
IRQ.txt
Intel-IOMMU.txt
Makefile mei: move doc files Documentation/misc-devices/mei 2012-05-09 13:59:09 -07:00
ManagementStyle Documentation: ManagementStyle: fixed typo 2012-06-28 12:03:15 +02:00
SAK.txt
SM501.txt
SecurityBugs
SubmitChecklist
SubmittingDrivers
SubmittingPatches Documentation/SubmittingPatches: suggested the use of scripts/get_maintainer.pl 2012-05-25 16:18:30 +02:00
VGA-softcursor.txt
applying-patches.txt
atomic_ops.txt
bad_memory.txt
basic_profiling.txt
binfmt_misc.txt
braille-console.txt
bt8xxgpio.txt
btmrvl.txt
bus-virt-phys-mapping.txt
cachetlb.txt
circular-buffers.txt
clk.txt
coccinelle.txt
cpu-hotplug.txt doc: Add x86 CPU0 online/offline feature 2012-11-14 09:39:44 -08:00
cpu-load.txt
cputopology.txt
crc32.txt
dcdbas.txt
debugging-modules.txt
debugging-via-ohci1394.txt
dell_rbu.txt
devices.txt firmware: remove last vestiges of dabusb 2012-11-21 13:03:01 -08:00
digsig.txt
dma-buf-sharing.txt doc: fix quite a few typos within Documentation 2012-11-19 14:28:24 +01:00
dmaengine.txt
dontdiff x86: remove offsets.h from .gitignore and dontdiff 2012-11-19 14:10:53 +01:00
dynamic-debug-howto.txt dynamic_debug: update Documentation/*, Kconfig.debug 2012-04-30 16:26:30 -04:00
edac.txt Merge branch 'devel' 2012-07-29 21:11:05 -03:00
eisa.txt MCA: delete all remaining traces of microchannel bus support. 2012-05-17 19:06:13 -04:00
email-clients.txt
flexible-arrays.txt
futex-requeue-pi.txt
gcov.txt
gpio.txt gpiolib: provide provision to register pin ranges 2012-11-11 19:06:00 +01:00
highuid.txt
hw_random.txt
hwspinlock.txt
init.txt
initrd.txt Documentation/initrd.txt: Change the location of util-linux 2012-05-25 16:18:34 +02:00
intel_txt.txt
io-mapping.txt
io_ordering.txt
iostats.txt
irqflags-tracing.txt
isapnp.txt
java.txt
kernel-doc-nano-HOWTO.txt Kernel-doc: Convention: Use a "Return" section to describe return values 2012-11-27 21:08:57 +01:00
kernel-docs.txt
kernel-parameters.txt Documentation/kernel-parameters.txt: update mem= option's spec according to its implementation 2012-12-17 17:15:12 -08:00
kmemcheck.txt
kmemleak.txt
kobject.txt Documentation: Fix "struct kobj_type" to include newer members. 2012-09-04 16:06:34 -07:00
kprobes.txt
kref.txt kref: Add kref_get_unless_zero documentation 2012-11-28 18:36:06 +10:00
ldm.txt
local_ops.txt
lockdep-design.txt
lockstat.txt
lockup-watchdogs.txt
logo.gif
logo.txt
magic-number.txt
md.txt
media-framework.txt [media] media: Add link_validate() op to check links to the sink pad 2012-05-14 08:44:11 -03:00
memory-barriers.txt Documentation: Fix memory-barriers.txt example 2012-10-23 14:44:46 -07:00
memory-hotplug.txt hotplug: update nodemasks management 2012-12-12 17:38:33 -08:00
mono.txt
mutex-design.txt
nommu-mmap.txt
numastat.txt
oops-tracing.txt
padata.txt
parport-lowlevel.txt
parport.txt
percpu-rw-semaphore.txt percpu-rw-semaphore: fix documentation typos 2012-09-26 19:56:15 +02:00
pi-futex.txt
pinctrl.txt gpiolib: provide provision to register pin ranges 2012-11-11 19:06:00 +01:00
pnp.txt
preempt-locking.txt
printk-formats.txt lib/vsprintf: update documentation to cover all of %p[Mm][FR] 2012-10-06 03:04:50 +09:00
pwm.txt pwm: add devm_pwm_get() and devm_pwm_put() 2012-09-10 17:05:45 +02:00
ramoops.txt pstore/ftrace: Convert to its own enable/disable debugfs knob 2012-09-06 22:16:58 -07:00
rbtree.txt rbtree: move augmented rbtree functionality to rbtree_augmented.h 2012-10-09 16:22:40 +09:00
remoteproc.txt remoteproc: add rproc_report_crash function to notify rproc crashes 2012-09-18 12:53:22 +03:00
rfkill.txt
robust-futex-ABI.txt
robust-futexes.txt
rpmsg.txt
rt-mutex-design.txt
rt-mutex.txt
rtc.txt rtc-proc: permit the /proc/driver/rtc device to use other devices 2012-10-06 03:05:01 +09:00
serial-console.txt
sgi-ioc4.txt
sgi-visws.txt
smsc_ece1099.txt mfd: smsc: Add support for smsc gpio io/keypad driver 2012-10-01 15:27:48 +02:00
sparse.txt Documentation/sparse.txt: document context annotations for lock checking 2012-12-17 17:15:23 -08:00
spinlocks.txt
stable_api_nonsense.txt
stable_kernel_rules.txt stable: Allow merging of backports for serious user-visible performance issues 2012-06-25 12:11:58 -07:00
static-keys.txt Documentation: Fix typo in multiple files in Documentation 2012-04-16 14:37:13 +02:00
svga.txt
sysfs-rules.txt
sysrq.txt sparc64: Add global PMU register dumping via sysrq. 2012-10-16 09:34:01 -07:00
unaligned-memory-access.txt
unicode.txt
unshare.txt
vfio.txt vfio: Trivial Documentation correction 2012-09-21 10:48:03 -06:00
vgaarbiter.txt
video-output.txt
vme_api.txt VME: Move API documentation to Documentation folder 2012-05-08 16:01:34 -07:00
volatile-considered-harmful.txt
workqueue.txt workqueue: reimplement WQ_HIGHPRI using a separate worker_pool 2012-07-13 22:24:45 -07:00
xz.txt
zorro.txt