Commit Graph

60096 Commits (c9c64155f5a81b4b41e98f9fb9c464a565c1bf72)

Author SHA1 Message Date
Yasuyuki Kozakai d87d8469e2 [NETFILTER]: nf_conntrack: Increment error count on parsing IPv4 header
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14 20:44:23 -07:00
Michael Chan 6460d948f3 [NET]: Add ethtool support for NETIF_F_IPV6_CSUM devices.
Add ethtool utility function to set or clear IPV6_CSUM feature flag.
Modify tg3.c and bnx2.c to use this function when doing ethtool -K
to change tx checksum.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14 19:07:52 -07:00
Ursula Braun febca281f6 [AF_IUCV]: Add lock when updating accept_q
The accept_queue of an af_iucv socket will be corrupted, if
adding and deleting of entries in this queue occurs at the
same time (connect request from one client, while accept call
is processed for another client).
Solution: add locking when updating accept_q

Signed-off-by: Ursula Braun <braunu@de.ibm.com>
Acked-by: Frank Pavlic <fpavlic@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14 19:04:25 -07:00
Ursula Braun 13fdc9a74d [AF_IUCV]: Avoid deadlock between iucv_path_connect and tasklet.
An iucv deadlock may occur, where one CPU is spinning on the
iucv_table_lock for iucv_tasklet_fn(), while another CPU is holding
the iucv_table_lock for an iucv_path_connect() and is waiting for
the first CPU in an smp_call_function.
Solution: replace spin_lock in iucv_tasklet_fn by spin_trylock and
reschedule tasklet in case of non-granted lock.

Signed-off-by: Ursula Braun <braunu@de.ibm.com>
Acked-by: Frank Pavlic <fpavlic@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14 19:03:41 -07:00
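The trylock-and-reschedule idea in the [AF_IUCV] deadlock fix above can be illustrated outside the kernel. This is a minimal user-space sketch, not the af_iucv code: a C11 atomic_flag stands in for iucv_table_lock, and the names tasklet_sim and tasklet_fn_sim are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the tasklet state; not a kernel structure. */
struct tasklet_sim {
    atomic_flag table_lock;  /* plays the role of iucv_table_lock */
    int runs;                /* times the work was actually done */
    int rescheduled;         /* times the work was deferred */
};

/*
 * Try to take the lock without spinning. If another context holds it,
 * record a reschedule and return instead of waiting, so the holder is
 * never blocked by this function.
 */
static bool tasklet_fn_sim(struct tasklet_sim *t)
{
    assert(t != NULL);
    if (atomic_flag_test_and_set(&t->table_lock)) {
        t->rescheduled++;    /* lock not granted: try again later */
        return false;
    }
    t->runs++;               /* lock granted: do the work */
    atomic_flag_clear(&t->table_lock);
    return true;
}
```

A caller would initialize table_lock with ATOMIC_FLAG_INIT and call tasklet_fn_sim() again whenever it returns false.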
Jennifer Hunt da7de31cc5 [AF_IUCV]: Improve description of IUCV and AFIUCV configuration options.
Signed-off-by: Jennifer Hunt <jenhunt@us.ibm.com>
Signed-off-by: Ursula Braun <braunu@de.ibm.com>
Acked-by: Frank Pavlic <fpavlic@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14 19:03:00 -07:00
Adrian Bunk acd159b6b5 [INET_SOCK]: make net/ipv4/inet_timewait_sock.c:__inet_twsk_kill() static
This patch makes the needlessly global __inet_twsk_kill() static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14 19:00:59 -07:00
David S. Miller cf3842ec50 Merge branch 'upstream-davem' of master.kernel.org:/pub/scm/linux/kernel/git/linville/wireless-2.6
2007-07-14 18:58:49 -07:00
Stephen Hemminger b3b0b681b1 [TCP]: tcp probe add back ssthresh field
Sangtae noticed the ssthresh got missed.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14 18:57:19 -07:00
Patrick McHardy a7ecfc8665 [VLAN]: Fix memset length
Fix the sizeof(ETH_ALEN) bug introduced by my rtnl_link patches.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14 18:56:30 -07:00
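The class of bug named in the [VLAN] memset fix above is easy to reproduce in user space. This is a hedged sketch rather than the patched kernel code: ETH_ALEN is redefined locally, and the helper names are hypothetical. Because ETH_ALEN is an integer constant, sizeof(ETH_ALEN) yields the size of an int, not the 6 bytes of a MAC address.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ETH_ALEN 6  /* octets in an Ethernet MAC address (local definition) */

/* Buggy length: the size of the int constant 6, not the value 6. */
static size_t buggy_clear_len(void) { return sizeof(ETH_ALEN); }

/* Fixed length: the number of bytes in the address. */
static size_t fixed_clear_len(void) { return ETH_ALEN; }

/* Fill a MAC buffer with 0xff, clear `len` bytes, count what is left set. */
static int nonzero_after_clear(size_t len)
{
    unsigned char mac[ETH_ALEN];
    int left = 0;

    assert(sizeof(mac) == ETH_ALEN);
    if (len > sizeof(mac))       /* never write past the buffer */
        len = sizeof(mac);
    memset(mac, 0xff, sizeof(mac));
    memset(mac, 0, len);
    for (size_t i = 0; i < sizeof(mac); i++)
        if (mac[i] != 0)
            left++;
    return left;
}
```

With the fixed length the whole address is cleared. With the buggy length, on platforms where int is narrower than 6 bytes, the trailing bytes keep their old contents.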
Patrick McHardy b863ceb7dd [NET]: Add macvlan driver
Add the macvlan driver, which allows creating virtual Ethernet devices
based on MAC addresses.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14 18:55:06 -07:00
Patrick McHardy 56addd6eee [VLAN]: Use multicast list synchronization helpers
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14 18:53:28 -07:00
Patrick McHardy 6c78dcbd47 [VLAN]: Fix promiscuous/allmulti synchronization races
The set_multicast_list function may be called without holding the rtnl
mutex, resulting in races when changing the underlying device's promiscuous
and allmulti state. Use the change_rx_mode hook, which is always invoked
under the rtnl.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14 18:52:56 -07:00
Patrick McHardy a0a400d79e [NET]: dev_mcast: add multicast list synchronization helpers
The method drivers currently use to synchronize multicast lists is not
very pretty:

- walk the multicast list
- search each entry on a copy of the previous list
- if new add to lower device
- walk the copy of the previous list
- search each entry on the current list
- if removed delete from lower device
- copy entire list

This patch adds a new field to struct dev_addr_list to store the
synchronization state and adds two helper functions for synchronization
and cleanup.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14 18:52:02 -07:00
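The per-entry synchronization state described in the dev_mcast commit above removes the need to keep and walk a copy of the previous list. This is a simplified user-space sketch of that idea, not the kernel helpers: the types addr_entry and lower_dev, the function sync_list, and the integer addresses are all hypothetical stand-ins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ADDRS 8

/* Hypothetical list entry; the `synced` flag is the stored sync state. */
struct addr_entry {
    int  addr;    /* stand-in for a multicast address */
    bool in_use;  /* entry is currently on the upper device's list */
    bool synced;  /* entry has been pushed to the lower device */
};

/* Hypothetical lower device holding the synchronized addresses. */
struct lower_dev {
    int    addrs[MAX_ADDRS];
    size_t count;
};

static bool lower_add(struct lower_dev *d, int addr)
{
    if (d->count >= MAX_ADDRS)
        return false;
    d->addrs[d->count++] = addr;
    return true;
}

static void lower_del(struct lower_dev *d, int addr)
{
    for (size_t i = 0; i < d->count; i++) {
        if (d->addrs[i] == addr) {
            d->count--;
            d->addrs[i] = d->addrs[d->count];  /* fill the hole */
            return;
        }
    }
}

/* Single pass: add new entries, drop removed ones, skip the rest. */
static void sync_list(struct addr_entry *list, size_t n, struct lower_dev *lower)
{
    assert(list != NULL && lower != NULL);
    for (size_t i = 0; i < n; i++) {
        struct addr_entry *e = &list[i];

        if (e->in_use && !e->synced) {
            if (lower_add(lower, e->addr))
                e->synced = true;
        } else if (!e->in_use && e->synced) {
            lower_del(lower, e->addr);
            e->synced = false;
        }
    }
}
```

Each call touches the lower device only for entries whose state changed since the previous call, which is the effect the two walks and the list copy achieved before.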
Patrick McHardy 24023451c8 [NET]: Add net_device change_rx_mode callback
Currently the set_multicast_list (and set_rx_mode) callbacks are
responsible for configuring the device according to the IFF_PROMISC,
IFF_MULTICAST and IFF_ALLMULTI flags and the mc_list (and uc_list in
case of set_rx_mode).

These callbacks can be invoked from BH context without the rtnl_mutex
by dev_mc_add/dev_mc_delete, which makes reading the device flags and
promiscuous/allmulti count racy. For real hardware drivers that just
commit all changes to the hardware this is not a real problem since
the stack guarantees to call them for every change, so at least the
final call will not race and commit the correct configuration to the
hardware.

For software devices that want to synchronize promiscuous and multicast
state to an underlying device, however, this can cause corruption of the
underlying device's flags or promisc/allmulti counts.

When the software device is concurrently put in promiscuous or allmulti
mode while set_multicast_list is invoked from bottom half context, the
device might synchronize the change to the underlying device without
holding the rtnl_mutex, which races with concurrent changes to the
underlying device.

Add a dev->change_rx_flags hook that is invoked when any of the flags
that affect rx filtering change (under the rtnl_mutex), which allows
drivers to perform synchronization immediately and only synchronize
the address lists in set_multicast_list/set_rx_mode.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14 18:51:31 -07:00
Ingo Molnar e6c9116d1d [RFKILL]: fix net/rfkill/rfkill-input.c bug on 64-bit systems
Subject: [patch] net/input: fix net/rfkill/rfkill-input.c bug on 64-bit systems

this recent commit:

 commit cf4328cd94
 Author: Ivo van Doorn <IvDoorn@gmail.com>
 Date:   Mon May 7 00:34:20 2007 -0700

     [NET]: rfkill: add support for input key to control wireless radio

added this 64-bit bug:

        ....
        unsigned int flags;

        spin_lock_irqsave(&task->lock, flags);
        ....

irq 'flags' must be unsigned long, not unsigned int. The -rt tree has 
strict checks about this on 64-bit so this triggered a build failure. 

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14 18:50:15 -07:00
Cornelia Huck 7689e82efd [SCSI] don't build scsi_dma_{map,unmap} for !HAS_DMA
With

 dma-mapping-prevent-dma-dependent-code-from-linking-on.patch

scsi fails to build on !HAS_DMA architectures:

drivers/built-in.o(.text+0x20af6): In function `scsi_dma_map':
: undefined reference to `dma_map_sg'
drivers/built-in.o(.text+0x20b5c): In function `scsi_dma_unmap':
: undefined reference to `dma_unmap_sg'

I split those functions out into a new file. Builds on s390 and i386.

Move scsi_dma_{map,unmap} into scsi_lib_dma.c which is only built if
HAS_DMA is set.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: James Bottomley <James.Bottomley@SteelEye.com>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-07-14 19:28:10 -05:00
Matthew Wilcox 6d877688ef [SCSI] Clean up scsi_add_lun a bit
This patch tidies up scsi_add_lun a bit.  I rewrote the kerneldoc to match
the actual parameters, moved the check for RBC and MMC REPORT_LUN devices
away from the switch(), changed the setup of sdev->type to account for
BLIST_ISROM, moved the check for BLIST_NO_ULD_ATTACH further down in
the function, removed a bogus comment and fixed some whitespace issues.

Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-07-14 19:13:13 -05:00
Thomas Bogendoerfer 0cba35e42c [SCSI] 53c700: Remove printk, which triggers because of low scsi clock on SNI RMs
remove printk, which triggers because of low scsi clock on SNI RMs

Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-07-14 19:12:43 -05:00
Thomas Bogendoerfer 2da8658910 [SCSI] sni_53c710: Cleanup
- base address is now a physical address; no need to convert it
- remove not needed error printk in module init function

Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-07-14 19:12:15 -05:00
David C Somayajulu 6ea7e33ee1 [SCSI] qla4xxx: Fix underrun/overrun conditions
This patch fixes the code handling underrun and overrun conditions.
Also fixed coding style as per Mike Christie's advice.

Signed-off-by: David Somayajulu <david.somayajulu@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-07-14 19:11:38 -05:00
Matthias Kaehlcke 0c2cc43379 [SCSI] megaraid_mbox: use mutex instead of semaphore
The Megaraid Mailbox driver uses a semaphore as mutex.  Use the mutex API
instead of the (binary) semaphore.

Signed-off-by: Matthias Kaehlcke <matthias.kaehlcke@gmail.com>
Acked-by: "Patro, Sumant" <Sumant.Patro@lsi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-07-14 19:10:19 -05:00
Salyzyn, Mark 5fa0f5e47a [SCSI] aacraid: add 51245, 51645 and 52245 adapters to documentation.
Adding Adaptec 51245 (16 port), 51645 (20 port) and 52445 (28 port)
Universal Serial RAID controllers to the aacraid documentation.

Signed-off-by: Mark Salyzyn <aacraid@adaptec.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-07-14 19:09:28 -05:00
Seokmann Ju cad7d7858b [SCSI] qla2xxx: update version to 8.02.00-k1.
The following patch bumps the driver version to reflect the addition of
NPIV to qla2xxx.

- version changed from 8.01.07-k7 to 8.02.00-k1.

Signed-off-by: Seokmann Ju <seokmann.ju@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-07-14 19:08:13 -05:00
Seokmann Ju 2c3dfe3f6a [SCSI] qla2xxx: add support for NPIV
The following patch adds support for NPIV (N-Port ID Virtualization) to
qla2xxx.

- supported within switched-fabric topologies only.
- supports up to 63 virtual ports on each physical port.

Signed-off-by: Seokmann Ju <seokmann.ju@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-07-14 19:08:05 -05:00
Ed Lin 968a5763fb [SCSI] stex: use resid for xfer len information
The original implementation in stex_ys_commands() is inappropriate.
For xfer len information, we should use resid instead.

Signed-off-by: Ed Lin <ed.lin@promise.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-07-14 18:59:10 -05:00
Matthew Wilcox 80dc3e062a [SCSI] Add Brownie 1200U3P to blacklist
The Brownie 1200U3P has the same problem with REPORT LUNS as the
1600U3P.  Add it to the blacklist.

Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-07-14 18:58:37 -05:00
Boaz Harrosh a73e45b3da [SCSI] scsi.c: convert to use the data buffer accessors
- a couple of prints, they can use the accessors

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-07-14 18:57:54 -05:00
Boaz Harrosh 0ab179bcf3 [SCSI] tmscsim: Further clean-up of the driver
- The saved sg_count was a leftover from the time the driver was doing
   dma mapping by itself. But now that scsi-ml is called for the mapping
   it is not the driver's responsibility.

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Acked-by: G. Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-07-14 18:56:33 -05:00
Geert Uytterhoeven cde760856c [SCSI] CONFIG_SCSI_FD_8xx no longer exists
CONFIG_SCSI_FD_8xx no longer exists.

Apparently it was renamed to CONFIG_SCSI_SEAGATE, but the Makefile was
not correctly updated.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-07-14 18:55:11 -05:00
Akinobu Mita da3962fe63 [SCSI] sr: fix error handling in module_init
Sweep registered blkdev when scsi_register_driver has failed.

Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-07-14 18:54:34 -05:00
James Bottomley a57850379e [SCSI] lpfc: Fix NPIV compile problem
drivers/scsi/lpfc/lpfc_init.c: In function 'lpfc_create_port':
drivers/scsi/lpfc/lpfc_init.c:1573: error: 'struct kobject' has no member named 'dentry'

Just remove the if check on this ... lpfc shouldn't be poking around
in kobject structures.

drivers/scsi/lpfc/lpfc_init.c: In function 'lpfc_pci_probe_one':
drivers/scsi/lpfc/lpfc_init.c:1723: warning: unused variable 'retval'

And remove the unused variable.

Cc: James Smart <James.Smart@Emulex.Com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-07-14 18:47:04 -05:00
FUJITA Tomonori c59fd9ebc4 [SCSI] lpfc: fix NPIV mapping problems
This patch uses dma_map_sg with phba->pcidev->dev instead of
scsi_dma_map.

scsi_dma_map doesn't work for NPIV since fc_vport->dev isn't fully
initialized. check_addr() in arch/x86_64/kernel/pci-nommu.c leads to
the crash since dev->dma_mask is NULL.

For more details:

http://marc.info/?l=linux-scsi&m=118312448030633&w=2

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: James Smart <James.Smart@Emulex.Com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-07-14 17:13:02 -05:00
Boaz Harrosh d4bd4cd063 [SCSI] lpfc: add missed data buffer accessor
This is an addendum to:

 commit a0b4f78f9a
 Author: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
    [SCSI] lpfc: convert to use the data buffer accessors

One place was missed in the merge

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Acked-by: James Smart <James.Smart@Emulex.Com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-07-14 17:11:16 -05:00
Priyanka Gupta d0f656cad3 [SCSI] Remove unused method scsi_device_cancel
Removes an obsolete method scsi_device_cancel which isn't being used
anywhere in the kernel.

Signed-off-by: Priyanka Gupta <priyankag@google.com>
Acked-by: Grant Grundler <grundler@parisc-linux.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-07-14 16:01:16 -05:00
Eric Van Hensbergen 0af8887ebf 9p: fix a race condition bug in umount which caused a segfault
umounting partitions after heavy activity would sometimes trigger a
segmentation violation.  This fix appears to remove that problem.
Fix originally provided by Latchesar Ionkov.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2007-07-14 15:14:19 -05:00
Eric Van Hensbergen 9e2f6688c0 9p: re-enable mount time debug option
During reorganization, the mount time debug option was removed in favor
of module-load-time parameters.  However, the mount time option is still
a useful feature during debugging and for user-fault isolation when the
module is compiled into the kernel.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2007-07-14 15:14:14 -05:00
Eric Van Hensbergen 9523a841b1 9p: cache meta-data when cache=loose
This patch expands the impact of the loose cache mode to allow for cached
metadata increasing the performance of directory listings and other metadata
read operations.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2007-07-14 15:14:08 -05:00
Latchesar Ionkov 1d6b560238 net/9p: set error to EREMOTEIO if trans->write returns zero
If trans->write returns 0, p9_write_work goes through the error path, but
sets the error code to zero.

This patch sets the error code to EREMOTEIO if trans->write returns zero
value.

Signed-off-by: Latchesar Ionkov <lucho@ionkov.net>
2007-07-14 15:14:01 -05:00
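The net/9p fix above concerns a write that returns zero: the caller takes its error path but reports an error code of zero. This is a small user-space sketch of the corrected mapping, assuming a hypothetical helper name; it is not the p9_write_work code itself. EREMOTEIO is Linux-specific, so a fallback definition is included for other systems.

```c
#include <assert.h>
#include <errno.h>

#ifndef EREMOTEIO
#define EREMOTEIO 121  /* Linux value, defined here only as a fallback */
#endif

/*
 * Hypothetical helper: turn the return value of a transport write into
 * an error code. A zero-byte write is reported as -EREMOTEIO so that the
 * error path never carries an error code of zero.
 */
static int write_result_to_err(int ret)
{
    if (ret > 0)
        return 0;            /* some bytes were written: no error */
    if (ret == 0)
        return -EREMOTEIO;   /* nothing written: report a remote I/O error */
    return ret;              /* already a negative error code */
}
```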
Latchesar Ionkov e46662be7f net/9p: change net/9p module name to 9pnet
Change module name of net/9p module from 9p.ko to 9pnet.ko. fs/9p module
already uses 9p.ko name.

Signed-off-by: Latchesar Ionkov <lucho@ionkov.net>
2007-07-14 15:13:50 -05:00
Latchesar Ionkov bd238fb431 9p: Reorganization of 9p file system code
This patchset moves non-filesystem interfaces of v9fs from fs/9p to net/9p.
It moves the transport, packet marshalling and connection layers to net/9p
leaving only the VFS related files in fs/9p.  This work is being done in
preparation for in-kernel 9p servers as well as alternate 9p clients (other
than VFS).

Signed-off-by: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2007-07-14 15:13:40 -05:00
David Chinner 0f1145cc18 [XFS] Fix lockdep annotations for xfs_lock_inodes
SGI-PV: 967035
SGI-Modid: xfs-linux-melb:xfs-kern:29026a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-07-14 18:09:42 +10:00
David Chinner d7f0923d83 [LIB]: export radix_tree_preload()
XFS filestreams functionality uses radix trees and the preload
functions. XFS can be built as a module and hence we need
radix_tree_preload() exported. radix_tree_preload_end() is a
static inline, so it doesn't need exporting.

Signed-Off-By: Dave Chinner <dgc@sgi.com>
Signed-Off-By: Tim Shimmin <tes@sgi.com>
2007-07-14 16:05:04 +10:00
Michal Marek faa63e9584 [XFS] Fix XFS_IOC_FSBULKSTAT{,_SINGLE} & XFS_IOC_FSINUMBERS in compat mode
* 32bit struct xfs_fsop_bulkreq has different size and layout of
members, no matter the alignment. Move the code out of the #else
branch (why was it there in the first place?). Define _32 variants of
the ioctl constants.
* 32bit struct xfs_bstat is different because of time_t and on
i386 because of different padding. Make xfs_bulkstat_one() accept a
custom "output formatter" in the private_data argument which takes care
of the xfs_bulkstat_one_compat() that takes care of the different
layout in the compat case.
* i386 struct xfs_inogrp has different padding.
Add a similar "output formatter" mechanism to xfs_inumbers().

SGI-PV: 967354
SGI-Modid: xfs-linux-melb:xfs-kern:29102a

Signed-off-by: Michal Marek <mmarek@suse.cz>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-07-14 15:42:50 +10:00
Michal Marek 1fa503df66 [XFS] Compat ioctl handler for handle operations
32bit struct xfs_fsop_handlereq has different size and offsets (due to
pointers). TODO: case XFS_IOC_{FSSETDM,ATTRLIST,ATTRMULTI}_BY_HANDLE still
not handled.

SGI-PV: 967354
SGI-Modid: xfs-linux-melb:xfs-kern:29101a

Signed-off-by: Michal Marek <mmarek@suse.cz>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-07-14 15:41:49 +10:00
Michal Marek 547e00c3c6 [XFS] Compat ioctl handler for XFS_IOC_FSGEOMETRY_V1.
i386 struct xfs_fsop_geom_v1 has no padding after the last member, so the
size is different.

SGI-PV: 967354
SGI-Modid: xfs-linux-melb:xfs-kern:29100a

Signed-off-by: Michal Marek <mmarek@suse.cz>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-07-14 15:41:39 +10:00
Eric Sandeen 3a59c94c4b [XFS] Clean up function name handling in tracing code
Remove the hardcoded "fnames" for tracing, and just embed them in tracing
macros via __FUNCTION__. Kills a lot of #ifdefs too.

SGI-PV: 967353
SGI-Modid: xfs-linux-melb:xfs-kern:29099a

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-07-14 15:41:24 +10:00
David Chinner b11f94d537 [XFS] Quota inode has no parent.
Avoid using a special "zero inode" as the parent of the quota inode as
this can confuse the filestreams code into thinking the quota inode has a
parent. We do not want the quota inode to follow filestreams allocation
rules, so pass a NULL as the parent inode and detect this condition when
doing stream associations.

SGI-PV: 964469
SGI-Modid: xfs-linux-melb:xfs-kern:29098a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-07-14 15:41:12 +10:00
David Chinner 2a82b8be8a [XFS] Concurrent Multi-File Data Streams
In media spaces, video is often stored in a frame-per-file format. When
dealing with uncompressed realtime HD video streams in this format, it is
crucial that files do not get fragmented and that multiple files are placed
contiguously on disk.

When multiple streams are being ingested and played out at the same time,
it is critical that the filesystem does not cross the streams and
interleave them together as this creates seek and readahead cache miss
latency and prevents both ingest and playout from meeting frame rate
targets.

This patch set introduces a "stream of files" concept into the allocator to
place all the data from a single stream contiguously on disk so that RAID
array readahead can be used effectively. Each additional stream gets
placed in different allocation groups within the filesystem, thereby
ensuring that we don't cross any streams. When an AG fills up, we select a
new AG for the stream that is not in use.

The core of the functionality is the stream tracking - each inode that we
create in a directory needs to be associated with the directory's stream.
Hence every time we create a file, we look up the directory's stream
object and associate the new file with that object.

Once we have a stream object for a file, we use the AG that the stream
object points to for allocations. If we can't allocate in that AG (e.g. it
is full) we move the entire stream to another AG. Other inodes in the same
stream are moved to the new AG on their next allocation (i.e. lazy
update).

Stream objects are kept in a cache and hold a reference on the inode.
Hence the inode cannot be reclaimed while there is an outstanding stream
reference. This means that on unlink we need to remove the stream
association and we also need to flush all the associations on certain
events that want to reclaim all unreferenced inodes (e.g. filesystem
freeze).

SGI-PV: 964469
SGI-Modid: xfs-linux-melb:xfs-kern:29096a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Barry Naujok <bnaujok@sgi.com>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Vlad Apostolov <vapo@sgi.com>
2007-07-14 15:40:53 +10:00
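The stream-to-AG placement described in the filestreams commit above can be sketched as a toy allocator. This is an illustrative user-space model under stated assumptions, not the XFS implementation: the types ag_sim and stream_sim, the function stream_alloc, the fixed AG count, and the first-fit choice of a new AG are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_AGS 4

/* Hypothetical allocation group: remaining space and current owner. */
struct ag_sim {
    int free_blocks;  /* space left in this AG */
    int owner;        /* id of the stream using this AG, or -1 if unused */
};

/* Hypothetical stream object: remembers which AG it allocates from. */
struct stream_sim {
    int ag;           /* current AG index, or -1 if none chosen yet */
};

/* Find an AG that no stream is using and that can hold the request. */
static int pick_unused_ag(const struct ag_sim *ags, int blocks)
{
    for (int i = 0; i < NUM_AGS; i++)
        if (ags[i].owner == -1 && ags[i].free_blocks >= blocks)
            return i;
    return -1;
}

/*
 * Allocate `blocks` for a stream. The stream stays in its AG while there
 * is room; when the AG cannot satisfy the request, the whole stream moves
 * to an AG that no other stream is using. Returns the AG used, or -1.
 */
static int stream_alloc(struct ag_sim *ags, struct stream_sim *s,
                        int stream_id, int blocks)
{
    assert(ags != NULL && s != NULL && blocks > 0);
    int ag = s->ag;

    if (ag < 0 || ags[ag].free_blocks < blocks) {
        int next = pick_unused_ag(ags, blocks);
        if (next < 0)
            return -1;             /* no suitable unused AG */
        if (ag >= 0)
            ags[ag].owner = -1;    /* release the AG we are leaving */
        ags[next].owner = stream_id;
        s->ag = next;
        ag = next;
    }
    ags[ag].free_blocks -= blocks;
    return ag;
}
```

In this model two streams never share an AG, so their data is never interleaved, and a stream only changes AG when its current one cannot hold the next allocation.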
Andrew Morton 0892ccd6fe [XFS] Use uninitialized_var macro to stop warning about rtx
Appease gcc in regards to "warning: 'rtx' is used uninitialized in
this function".

SGI-PV: 907752
SGI-Modid: xfs-linux-melb:xfs-kern:29007a

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-07-14 15:40:02 +10:00
Christoph Hellwig fbf3ce8d8e [XFS] XFS should not be looking at filp reference counts
A check for file_count is always a bad idea. Linux has the ->release
method to deal with cleanups on last close and ->flush is only for the
very rare case where we want to perform an operation on every drop of a
reference to a file struct.

This patch gets rid of vop_close and surrounding code in favour of simply
doing the page flushing from ->release.

SGI-PV: 966562
SGI-Modid: xfs-linux-melb:xfs-kern:28952a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-07-14 15:37:37 +10:00