Commit Graph

31613 Commits (ba39592764ed20cee09aae5352e603a27bf56b0d)

Author SHA1 Message Date
Keshavamurthy, Anil S ba39592764 Intel IOMMU: Intel IOMMU driver
Actual intel IOMMU driver.  Hardware spec can be found at:
http://www.intel.com/technology/virtualization

This driver sets X86_64 'dma_ops', so hook into standard DMA APIs.  In this
way, PCI driver will get virtual DMA address.  This change is transparent to
PCI drivers.

[akpm@linux-foundation.org: remove unneeded cast]
[akpm@linux-foundation.org: build fix]
[bunk@stusta.de: fix duplicate CONFIG_DMAR Makefile line]
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-22 08:13:18 -07:00
Keshavamurthy, Anil S f8de50eb6b Intel IOMMU: IOVA allocation and management routines
This code implements a generic IOVA allocation and management.  As per Dave's
suggestion we are now allocating IO virtual address from Higher DMA limit
address rather than lower end address and this eliminated the need to preserve
the IO virtual address for multiple devices sharing the same domain virtual
address.

Also this code uses red black trees to store the allocated and reserved iova
nodes.  This showed a good performance improvements over previous linear
linked list.

[akpm@linux-foundation.org: remove inlines]
[akpm@linux-foundation.org: coding style fixes]
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-22 08:13:18 -07:00
Keshavamurthy, Anil S 994a65e25d Intel IOMMU: PCI generic helper function
When devices are under a p2p bridge, upstream transactions get replaced by the
device id of the bridge as it owns the PCIE transaction.  Hence its necessary
to setup translations on behalf of the bridge as well.  Due to this limitation
all devices under a p2p share the same domain in a DMAR.

We just cache the type of device, if its a native PCIe device
or not for later use.

[akpm@linux-foundation.org: BUG_ON -> WARN_ON+recover]
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-22 08:13:18 -07:00
Keshavamurthy, Anil S 10e5247f40 Intel IOMMU: DMAR detection and parsing logic
This patch supports the upcomming Intel IOMMU hardware a.k.a.  Intel(R)
Virtualization Technology for Directed I/O Architecture and the hardware spec
for the same can be found here
http://www.intel.com/technology/virtualization/index.htm

FAQ! (questions from akpm, answers from ak)

> So...  what's all this code for?
>
> I assume that the intent here is to speed things up under Xen, etc?

Yes in some cases, but not this code.  That would be the Xen version of this
code that could potentially assign whole devices to guests.  I expect this to
be only useful in some special cases though because most hardware is not
virtualizable and you typically want an own instance for each guest.

Ok at some point KVM might implement this too; i likely would use this code
for this.

> Do we
> have any benchmark results to help us to decide whether a merge would be
> justified?

The main advantage for doing it in the normal kernel is not performance, but
more safety.  Broken devices won't be able to corrupt memory by doing random
DMA.

Unfortunately that doesn't work for graphics yet, for that need user space
interfaces for the X server are needed.

There are some potential performance benefits too:

- When you have a device that cannot address the complete address range an
  IOMMU can remap its memory instead of bounce buffering.  Remapping is likely
  cheaper than copying.

- The IOMMU can merge sg lists into a single virtual block.  This could
  potentially speed up SG IO when the device is slow walking SG lists.  [I
  long ago benchmarked 5% on some block benchmark with an old MPT Fusion; but
  it probably depends a lot on the HBA]

And you get better driver debugging because unexpected memory accesses from
the devices will cause a trappable event.

>
> Does it slow anything down?

It adds more overhead to each IO so yes.

This patch:

Add support for early detection and parsing of DMAR's (DMA Remapping) reported
to OS via ACPI tables.

DMA remapping(DMAR) devices support enables independent address translations
for Direct Memory Access(DMA) from Devices.  These DMA remapping devices are
reported via ACPI tables and includes pci device scope covered by these DMA
remapping device.

For detailed info on the specification of "Intel(R) Virtualization Technology
for Directed I/O Architecture" please see
http://www.intel.com/technology/virtualization/index.htm

Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Greg KH <greg@kroah.com>
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-22 08:13:18 -07:00
Yasunori Goto 7b78d335ac memory hotplug: rearrange memory hotplug notifier
Current memory notifier has some defects yet.  (Fortunately, nothing uses
it.) This patch is to fix and rearrange for them.

  - Add information of start_pfn, nr_pages, and node id if node status is
    changes from/to memoryless node for callback functions.
    Callbacks can't do anything without those information.
  - Add notification going-online status.
    It is necessary for creating per node structure before the node's
    pages are available.
  - Move GOING_OFFLINE status notification after page isolation.
    It is good place for return memory like cache for callback,
    because returned page is not used again.
  - Make CANCEL events for rollingback when error occurs.
  - Delete MEM_MAPPING_INVALID notification. It will be not used.
  - Fix compile error of (un)register_memory_notifier().

Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-22 08:13:17 -07:00
Mike Frysinger eaa854902a Blackfin serial driver Kconfig: depend on DMA not being enabled rather than a specific DMA size
Signed-off-by: Mike Frysinger <michael.frysinger@analog.com>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
2007-10-21 22:30:01 +08:00
Linus Torvalds cfa76f024f Merge branch 'master' of hera.kernel.org:/pub/scm/linux/kernel/git/kyle/parisc-2.6
* 'master' of hera.kernel.org:/pub/scm/linux/kernel/git/kyle/parisc-2.6: (29 commits)
  [PARISC] fix uninitialized variable warning in asm/rtc.h
  [PARISC] Port checkstack.pl to parisc
  [PARISC] Make palo target work when $obj != $src
  [PARISC] Zap unused variable warnings in pci.c
  [PARISC] Fix tests in palo target
  [PARISC] Fix palo target
  [PARISC] Restore palo target
  [PARISC] Attempt to clean up parisc/Makefile
  [PARISC] Fix infinite loop in /proc/iomem
  [PARISC] Quiet sysfs_create_link __must_check warnings in pdc_stable
  [PARISC] Squelch pci_enable_device __must_check warning in superio
  [PARISC] Kill off broken irqstack code
  [PARISC] Remove hardcoded uses of PAGE_SIZE
  [PARISC] Clean up pointless ASM_PAGE_SIZE_DIV use
  [PARISC] Kill off the last vestiges of ASM_PAGE_SIZE
  [PARISC] Kill off ASM_PAGE_SIZE use
  [PARISC] Beautify parisc vmlinux.lds.S
  [PARISC] Clean up a resource_size_t warning in sba_iommu
  [PARISC] Kill incorrect cast warning in unwinder
  [PARISC] Kill zone_to_nid printk warning
  ...

Fixed trivial conflict in include/asm-parisc/tlbflush.h manually
2007-10-20 20:19:15 -07:00
Al Viro 8add24413d vfc_dev conversion to mutex: fallout
Commit 7b96dc023a ("[SPARC] Videopix Frame
Grabber: Convert device_lock_sem to mutex") missed one place.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-20 15:04:06 -07:00
Linus Torvalds c00046c279 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: (74 commits)
  fix do_sys_open() prototype
  sysfs: trivial: fix sysfs_create_file kerneldoc spelling mistake
  Documentation: Fix typo in SubmitChecklist.
  Typo: depricated -> deprecated
  Add missing profile=kvm option to Documentation/kernel-parameters.txt
  fix typo about TBI in e1000 comment
  proc.txt: Add /proc/stat field
  small documentation fixes
  Fix compiler warning in smount example program from sharedsubtree.txt
  docs/sysfs: add missing word to sysfs attribute explanation
  documentation/ext3: grammar fixes
  Documentation/java.txt: typo and grammar fixes
  Documentation/filesystems/vfs.txt: typo fix
  include/asm-*/system.h: remove unused set_rmb(), set_wmb() macros
  trivial copy_data_pages() tidy up
  Fix typo in arch/x86/kernel/tsc_32.c
  file link fix for Pegasus USB net driver help
  remove unused return within void return function
  Typo fixes retrun -> return
  x86 hpet.h: remove broken links
  ...
2007-10-19 20:36:17 -07:00
Linus Torvalds 9abbf7d028 Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6: (21 commits)
  Fix build break in tsi108.c
  qeth: remove header_ops bug
  ir-functions.c:(.text+0xbce18): undefined reference to `input_event'
  NAPI: kconfig prompt and deleted doc file
  phy/bitbang: missing MODULE_LICENSE
  DM9000 initialization fix
  [PATCH] rt2x00: Add new rt73usb USB ID
  [PATCH] rt2x00: Fix residual check in PLCP calculations.
  [PATCH] iwlwifi: Fix rate setting in probe request for HW sacn
  [PATCH] b43: Make b43_stop() static
  [PATCH] drivers/net/wireless/b43/main.c: fix an uninitialized variable
  [PATCH] iwlwifi: set correct base rate for A band in rs_dbgfs_set_mcs
  [PATCH] zd1211rw, fix oops when ejecting install media
  [PATCH] b43legacy: Fix potential return of uninitialized variable
  [PATCH] iwl4965-base.c: fix off-by-one errors
  [PATCH] p54: Make filter configuration atomic
  [PATCH] rtl8187: remove NICMAC setting in configure_filters callback
  [PATCH] janitorial: fix all double includes in drivers/net/wireless
  [PATCH] rtl8187: Fix more frag bit checking, rts duration calc
  [PATCH] ipw2100: send WEXT scan events
  ...
2007-10-19 20:35:20 -07:00
Linus Torvalds b3d9d6be03 Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev:
  [libata] sata_sis: use correct S/G table size
  pata_cs5536: MWDMA fix
  sata_sis: fix SCR read breakage
  libata: fix kernel-doc param name
2007-10-19 20:34:29 -07:00
Jeff Garzik fe2520094d Merge branch 'fixes-jgarzik' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 into upstream 2007-10-19 23:04:50 -04:00
Olof Johansson c9b2ca735a Fix build break in tsi108.c
Fix build break:

drivers/net/tsi108_eth.c: In function 'tsi108_init_one':
drivers/net/tsi108_eth.c:1633: error: expected ')' before 'dev'
drivers/net/tsi108_eth.c:1633: warning: too few arguments for format
make[2]: *** [drivers/net/tsi108_eth.o] Error 1

Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-19 23:00:03 -04:00
Ursula Braun 224426f168 qeth: remove header_ops bug
Remove qeth bug caused by commit:
[NET]: Move hardware header operations out of netdevice.

Signed-off-by: Ursula Braun <braunu@de.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-19 23:00:02 -04:00
Randy Dunlap e0d5dab24d ir-functions.c:(.text+0xbce18): undefined reference to `input_event'
[bugme-daemon@bugzilla.kernel.org wrote:]

From: Randy Dunlap <randy.dunlap@oracle.com>

Drivers that use lro functions should depend on INET, otherwise they
may not link correctly.  Let's not select INET.  Select should be used
only for library-like code, not to enable subsystems.

ERROR: "lro_flush_all" [drivers/net/myri10ge/myri10ge.ko] undefined!
ERROR: "lro_receive_frags" [drivers/net/myri10ge/myri10ge.ko] undefined!

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-19 23:00:02 -04:00
Randy Dunlap bf45abeb1d NAPI: kconfig prompt and deleted doc file
- make the kconfig NAPI option prompt consistent across all net drivers
  (other than EXPERIMENTAL; can it now be removed also, or is the new
  napi_struct implementation now EXPERIMENTAL ?)
- remove comment about the now-deleted NAPI_HOWTO.txt file
- clean up typos in Tulip NAPI & Interrupt Mitigation

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-19 23:00:02 -04:00
Randy Dunlap 5a46236d20 phy/bitbang: missing MODULE_LICENSE
Missing MODULE_LICENSE(), loading this module taints the kernel.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-19 23:00:01 -04:00
Mike Rapoport 418d6f871b DM9000 initialization fix
DM9000 driver returns success even if it is failed to detect the chip.
Below patch fixes it.

Signed-off-by: Mike Rapoport <mike@compulab.co.il>

 drivers/net/dm9000.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-19 23:00:01 -04:00
Jeff Garzik 96af154710 [libata] sata_sis: use correct S/G table size
sata_sis has the same restrictions as other SFF controllers, and so must
use LIBATA_MAX_PRD to denote that SCSI may only fill ATA_MAX_PRD/2
entries, due to our need to handle IOMMU merging.

Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2007-10-19 22:56:44 -04:00
Bartlomiej Zolnierkiewicz 80f6fd3828 pata_cs5536: MWDMA fix
* Fix out-of-bound array access for MWDMA modes.

* Bump driver version.

Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-19 22:55:03 -04:00
Tejun Heo aaa092a114 sata_sis: fix SCR read breakage
SCR read for controllers which uses PCI configuration space for SCR
access got broken while adding @val argument to SCR accessors.  Fix
it.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-19 22:55:02 -04:00
Randy Dunlap 5c1ad8b305 libata: fix kernel-doc param name
Fix libata kernel-doc parameter name.

Warning(linux-2.6.23-git13//drivers/ata/libata-core.c:1415): No description found for parameter 'sgl'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-19 22:55:02 -04:00
Linus Torvalds 5f737085be Merge master.kernel.org:/pub/scm/linux/kernel/git/bart/ide-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/bart/ide-2.6: (50 commits)
  ide: remove inclusion of non-existent io_trace.h
  ide-disk: add get_smart_data() helper
  ide: fix ->data_phase in taskfile_load_raw()
  ide: check drive->using_dma in flagged_taskfile()
  ide: check ->dma_setup() return value in flagged_taskfile()
  dtc2278: note on docs
  qd65xx: remove pointless qd_{read,write}_reg() (take 2)
  ide: PCI BMDMA initialization fixes (take 2)
  ide: remove stale comments from ide-taskfile.c
  ide: remove dead code from ide_driveid_update()
  ide: use __ide_end_request() in ide_end_dequeued_request()
  ide: enhance ide_setup_pci_noise()
  cs5530: remove needless ide_lock taking
  ide: take ide_lock for prefetch disable/enable in do_special()
  ht6560b: fix deadlock on error handling
  cmd640: fix deadlock on error handling
  slc90e66: fix deadlock on error handling
  opti621: fix deadlock on error handling
  qd65xx: fix deadlock on error handling
  dtc2278: fix deadlock on error handling
  ...
2007-10-19 19:36:05 -07:00
Rolf Eike Beer 405bbe9fa3 Typo: depricated -> deprecated
Typo: depricated -> deprecated

Signed-off-by: Rolf Eike Beer <eike-kernel@sf-tec.de>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
2007-10-20 03:10:57 +02:00
Masatake YAMATO 828d055fd0 fix typo about TBI in e1000 comment
Signed-off-by: Masatake YAMATO <jet@gyve.org>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
2007-10-20 03:06:37 +02:00
Milan Broz 80fd662683 dm crypt: tidy pending
Add crypt prefix to dec_pending to avoid confusing it in backtraces with
the dm core function of the same name.

No functional change here.

Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2007-10-20 02:01:28 +01:00
Mike Anderson b15546f942 dm mpath: send uevents
This patch adds calls to dm_path_event for a failed path and a reinstated
path.

Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2007-10-20 02:01:27 +01:00
Mike Anderson 7a8c3d3b92 dm: uevent generate events
This patch adds support for the dm_path_event dm_send_event functions which
create and send udev events.

Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2007-10-20 02:01:26 +01:00
Mike Anderson 51e5b2bd34 dm: add uevent to core
This patch adds a uevent skeleton to device-mapper.

Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2007-10-20 02:01:24 +01:00
Mike Anderson 96a1f7dba6 dm: export name and uuid
This patch adds a function to obtain a copy of a mapped device's name and uuid.

Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2007-10-20 02:01:23 +01:00
Jonathan Brassow aa5617c553 dm raid1: add mirror_set to struct mirror
Store a pointer to the owning mirror_set structure within each mirror
structure for a subsequent patch to use.

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2007-10-20 02:01:22 +01:00
Jonathan Brassow 6b3df0d7a5 dm log: split suspend
There are now two phases to a suspend in device-mapper -
presuspend and postsuspend.  This patch removes the
single 'suspend' in the logging API and replaces it with
'presuspend' and 'postsuspend' functions to align it
better with core device-mapper.

A subsequent patch will make use of 'presuspend'.

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2007-10-20 02:01:21 +01:00
Dave Wysochanski fe97e2aa05 dm mpath: hp retry if not ready
This patch adds retries to the hp hardware handler, and utilizes the
MP_RETRY flag of dm-multipath.  For now in the hp handler, if we get a
pg_init completed with a check condition we just assume we can retry the
pg_init command.  We make this assumption because of incomplete data on
specific check condition code of the HP hardware, and because testing
has shown the HP path initialization command to be idempotent.
The number of times we retry is settable via the "pg_init_retries"
multipath map feature.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Acked-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2007-10-20 02:01:20 +01:00
Dave Wysochanski 16ebbf3584 dm mpath: add hp handler
This patch adds the most basic dm-multipath hardware support for the
HP active/passive arrays.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Acked-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2007-10-20 02:01:19 +01:00
Dave Wysochanski c9e45581ad dm mpath: add retry pg init
This patch allows a failed path group initialisation command to be retried.

It adds a generic MP_RETRY flag and a "pg_init_retries" feature to
device-mapper multipath which limits the number of retries.

1. A hw handler sends a path initialization command to the storage and
the command completes with an error code indicating the command
should be retried.

2. The hardware handler calls dm_pg_init_complete() with MP_RETRY
set in err_flags to ask the dm multipath core to retry.

3. If the retry limit has not been exceeded, pg_init() is retried.
Otherwise fail_path() is called.

If you are using the userspace multipath-tools or device-mapper-multipath
package, you can set pg_init_retries in the 'device' section of your
/etc/multipath.conf file. For example:

features                "2 pg_init_retries 7"

The number of PG retries attempted is reported in the 'dmsetup status' output.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Acked-by: Mike Christie <michaelc@cs.wisc.edu>
Acked-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2007-10-20 02:01:18 +01:00
Milan Broz 636d5786c4 dm crypt: tidy labels
Replace numbers with names in labels in error paths, to avoid confusion
when new one get added between existing ones.

Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2007-10-20 02:01:17 +01:00
Milan Broz d469f84197 dm crypt: tidy whitespace
Clean up, convert some spaces to tabs.

No functional change here.

Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2007-10-20 02:01:15 +01:00
Milan Broz cabf08e4d3 dm crypt: add post processing queue
Add post-processing queue (per crypt device) for read operations.

Current implementation uses only one queue for all operations
and this can lead to starvation caused by many requests waiting
for memory allocation. But the needed memory-releasing operation
is queued after these requests (in the same queue).

Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2007-10-20 02:01:14 +01:00
Milan Broz 9934a8bea2 dm crypt: use per device singlethread workqueues
Use a separate single-threaded workqueue for each crypt device
instead of one global workqueue.

Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2007-10-20 02:01:13 +01:00
Alasdair G Kergon d336416ff1 dm mpath: emc fix an error message
Correct an error message, reported by Michael Wood <michael@frogfoot.com>.

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2007-10-20 02:01:12 +01:00
Alasdair G Kergon 051814c69f dm: bio_list macro renaming
Remove BIO_LIST and DEFINE_BIO_LIST macros that gain us nothing
since contents are initialised to NULL.

Cc: Jan Engelhardt <jengelh@linux01.gwdg.de>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2007-10-20 02:01:11 +01:00
Jesper Juhl bb56acf840 dm io:ctl remove vmalloc void cast
In drivers/md/dm-ioctl.c::copy_params() there's a call to vmalloc()
where we currently cast the return value, but that's pretty pointless
given that vmalloc() returns "void *".

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2007-10-20 02:01:10 +01:00
Milan Broz 9e4e5f87eb dm: tidy bio_io_error usage
Use bio_io_error() in only two places and tidy the code,
preparing for later patches.

There is no functional change in this patch.

Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2007-10-20 02:01:09 +01:00
Matthias Kaehlcke def5b5b26e kcopyd use mutex instead of semaphore
Kcopyd uses a semaphore as mutex.  Use the mutex API instead of the (binary)
semaphore,

Signed-off-by: Matthias Kaehlcke <matthias.kaehlcke@gmail.com>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2007-10-20 02:01:08 +01:00
Dmitry Monakhov 094262db9e dm: use kzalloc
Convert kmalloc() + memset() to kzalloc().

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2007-10-20 02:01:07 +01:00
vignesh babu 6f3c3f0afa dm: use is_power_of_2
Replacing n & (n - 1) for power of 2 check by is_power_of_2(n)

Signed-off-by: vignesh babu <vignesh.babu@wipro.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2007-10-20 02:01:06 +01:00
Jun'ichi Nomura ae9da83f6d dm: fix thaw_bdev
This patch fixes a bd_mount_sem counter corruption bug in device-mapper.

thaw_bdev() should be called only when freeze_bdev() was called for the
device.
Otherwise, thaw_bdev() will up bd_mount_sem and corrupt the semaphore counter.
struct block_device with the corrupted semaphore may remain in slab cache
and be reused later.

Attached patch will fix it by calling unlock_fs() instead.
unlock_fs() will determine whether it should call thaw_bdev()
by checking the device is frozen or not.

Easy reproducer is:
  #!/bin/sh
  while [ 1 ]; do
     dmsetup --notable create a
     dmsetup --nolockfs suspend a
     dmsetup remove a
  done

It's not easy to see the effect of corrupted semaphore.
So I have tested with putting printk below in bdev_alloc_inode():
        if (atomic_read(&ei->bdev.bd_mount_sem.count) != 1)
                printk(KERN_DEBUG "Incorrect semaphore count = %d (%p)\n",
                        atomic_read(&ei->bdev.bd_mount_sem.count),
                        &ei->bdev);

Without the patch, I saw something like:
 Incorrect semaphore count = 17 (f2ab91c0)

With the patch, the message didn't appear.

The bug was introduced in 2.6.16 with this bug fix:

commit d9dde59ba0
Date:   Fri Feb 24 13:04:24 2006 -0800

    [PATCH] dm: missing bdput/thaw_bdev at removal

    Need to unfreeze and release bdev otherwise the bdev inode with
    inconsistent state is reused later and cause problem.

and backported to 2.6.15.5.

It occurs only in free_dev(), which is called only when the dm device is
removed.  The buggy code is executed only if md->suspended_bdev is
non-NULL and that can happen only when the device was suspended without
noflush.

Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Cc: stable@kernel.org
2007-10-20 02:01:05 +01:00
Milan Broz 79662d1ea3 dm delay: fix status
Fix missing space in dm-delay target status output
if separate read and write delay are configured.

Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2007-10-20 02:01:04 +01:00
Dmitry Monakhov 2e64a0f928 dm delay: fix ctr error paths
Add missing 'dm_put_device' to dm-delay target constructor.

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2007-10-20 02:01:02 +01:00
Dmitry Monakhov a72cf737e0 dm raid1: fix leakage
Add missing 'dm_io_client_destroy' to alloc_context error path.
Reorganize mirror constructor error path in order to prevent
workqueue leakage.

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2007-10-20 02:01:01 +01:00