Commit Graph

1833 Commits (329ad399a9b3adf52c90637b21ca029fcf7f8795)

Author SHA1 Message Date
Steve Wise d4f1a5c6ef RDMA/cxgb4: Use correct control txq
There is only one control txq per tx channel.  So use the port number
as the queue index when sending.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-08-02 21:06:12 -07:00
Steve Wise 73d6fcad2a RDMA/cxgb4: Fix race in fini path
There exists a race condition where the app disconnects, which
initiates an orderly close (via rdma_fini()), concurrently with an
ingress abort condition, which initiates an abortive close operation.
Since rdma_fini() must be called without IRQs disabled, the fini can
be called after the QP has been transitioned to ERROR.  This is ok,
but we need to protect against qp->ep getting NULLed.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-08-02 21:06:06 -07:00
Faisal Latif cd6860eb03 RDMA/nes: Fix hangs on ifdown
When ib_unregister_device() is called from netdev stop during ifdown,
it sometimes hangs.  Instead, report port_err through ib_dispatch_event()
during netdev stop and port_active during netdev open, and call
ib_unregister_device() only when the module is removed.

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-07-28 15:14:27 -07:00
Chien Tung 0eec495ee6 RDMA/nes: Store and print eeprom version
Read and print eeprom version and save it off for later use.
Also delete a tab.

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-07-28 15:12:38 -07:00
Peter Huewe 33085bb8da RDMA/nes: Convert pci_table entries to PCI_VDEVICE
This patch converts pci_table entries, where .subvendor=PCI_ANY_ID and
.subdevice=PCI_ANY_ID, .class=0 and .class_mask=0, to use the
PCI_VDEVICE macro, and thus improves readability.

Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-07-28 10:39:33 -07:00
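The readability gain described in the PCI_VDEVICE commit above can be sketched in user-space C. Everything here (FAKE_VDEVICE, struct fake_pci_id, the ACME vendor ID) is a made-up stand-in for illustration, not the kernel's definitions; it only assumes that the macro fills in the wildcard fields named in the commit message.

```c
#define FAKE_ANY_ID (~0u)		/* stand-in for PCI_ANY_ID */
#define FAKE_VENDOR_ID_ACME 0x1234u	/* made-up vendor ID */

struct fake_pci_id {
	unsigned int vendor, device;
	unsigned int subvendor, subdevice;
	unsigned int class, class_mask;
};

/* Simplified stand-in for PCI_VDEVICE(): fills the wildcard fields. */
#define FAKE_VDEVICE(vend, dev) \
	.vendor = FAKE_VENDOR_ID_##vend, .device = (dev), \
	.subvendor = FAKE_ANY_ID, .subdevice = FAKE_ANY_ID, \
	.class = 0, .class_mask = 0

static const struct fake_pci_id id_table[] = {
	{ FAKE_VDEVICE(ACME, 0x0100) },	/* one line instead of six fields */
	{ 0 },				/* terminating entry */
};
```

Each table entry then names only the vendor and device, which is the part that differs between entries.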
Alexander Schmidt e675b6db12 IB/ehca: Catch failing ioremap()
When ioremap() fails with a NULL pointer, catch the error and pass it
to the caller of create_qp() or create_cq() instead of trying to
dereference the NULL pointer later on.

Signed-off-by: Alexander Schmidt <alexs@linux.vnet.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-07-21 12:46:29 -07:00
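The pattern in the ioremap() commit above, checking the mapping result and returning an error instead of dereferencing NULL later, might look roughly like this. fake_ioremap() and fake_create_queue() are hypothetical user-space stand-ins, and -ENOMEM is an assumed error code, not necessarily the one the driver returns.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for ioremap(): returns NULL when the mapping fails. */
static void *fake_ioremap(unsigned long phys_addr, int should_fail)
{
	static char window[64];

	(void)phys_addr;
	return should_fail ? NULL : window;
}

struct fake_queue {
	void *regs;	/* mapped register window */
};

/* Returns 0 on success or a negative errno; the caller sees the failure. */
static int fake_create_queue(struct fake_queue *q, unsigned long phys_addr,
			     int should_fail)
{
	q->regs = fake_ioremap(phys_addr, should_fail);
	if (!q->regs)
		return -ENOMEM;	/* assumed error code for this sketch */
	return 0;
}
```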
Dave Olson bdf8edcb57 IB/qib: Allow PSM to select from multiple port assignment algorithms
We used to allow only full specification, or using all contexts within
an HCA before moving to the next HCA.  We now allow an additional
method -- round-robining through HCAs -- and make that the default.

Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-07-21 11:39:36 -07:00
Ralph Campbell 2d978a953b IB/qib: Turn off IB latency mode
Turn off IB latency mode. This improves link quality for slower
process chips.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-07-21 11:39:31 -07:00
Arnd Bergmann dd378c2102 IB/qib: Use generic_file_llseek
When the default llseek action gets changed to no_llseek, all file
systems relying on the current behaviour need to set explicit .llseek
operations.

In case of qib_fs, we want the files to be seekable, so
generic_file_llseek fits best.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-07-21 11:39:27 -07:00
Steve Wise d37ac31ddc RDMA/cxgb4: Support variable sized work requests
T4 EQ entries are in multiples of 64 bytes.  Currently the RDMA SQ and
RQ use fixed sized entries composed of 4 EQ entries for the SQ and 2
EQ entries for the RQ.  For optimal latency with small IO, we need to
change this so the HW only needs to DMA the EQ entries actually used
by a given work request.

Implementation:

- add wq_pidx counter to track where we are in the EQ.  cidx/pidx are
  used for the sw sq/rq tracking and flow control.

- the variable part of work requests is the SGL.  Add new functions to
  build the SGL and/or immediate data directly in the EQ memory,
  wrapping when needed.

- adjust the min burst size for the EQ contexts to 64B.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-07-21 11:16:20 -07:00
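Two ideas from the variable-sized work request commit above can be sketched in plain C: rounding a request up to whole 64-byte EQ entries, and writing into a circular queue with wrap-around. The function names and the byte-offset interface are assumptions made for illustration; the driver's own helpers are not shown in this log. The sketch assumes off < queue_len and len <= queue_len.

```c
#include <stddef.h>
#include <string.h>

#define EQ_ENTRY_SIZE 64u	/* from the commit text: 64-byte multiples */

/* Number of 64-byte EQ entries needed for a work request of len bytes. */
static unsigned int eq_entries_needed(size_t len)
{
	return (unsigned int)((len + EQ_ENTRY_SIZE - 1) / EQ_ENTRY_SIZE);
}

/* Copy len bytes into a circular queue at byte offset off, wrapping at the end. */
static size_t copy_wrapped(unsigned char *queue, size_t queue_len, size_t off,
			   const unsigned char *src, size_t len)
{
	size_t first = queue_len - off;	/* room before the end of the queue */

	if (first > len)
		first = len;
	memcpy(queue + off, src, first);
	memcpy(queue, src + first, len - first);	/* wrapped part, may be 0 bytes */
	return (off + len) % queue_len;			/* next write offset */
}
```

With fixed-size entries a small request would always occupy the full 4 (SQ) or 2 (RQ) entries; with the rounding above a request under 64 bytes needs only one.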
Dan Carpenter 3d4f9a28e0 RDMA/cxgb3: Clean up signed check of unsigned variable
Q_FREECNT() returns the number of spaces free.  This should never be a
negative amount.  Also the num_wrs is an unsigned int so it can never
be less than zero.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-07-21 10:57:25 -07:00
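As a small illustration of the cxgb3 cleanup above: a comparison such as "num_wrs < 0" is always false when num_wrs is unsigned, so only the comparison against the free count carries meaning. has_room() is a hypothetical helper written for this sketch and does not reproduce the driver code.

```c
#include <stdbool.h>

/* Hypothetical helper: can num_wrs more work requests be posted? */
static bool has_room(unsigned int free_slots, unsigned int num_wrs)
{
	/*
	 * A test like "num_wrs < 0" can never be true for an unsigned
	 * operand, so it is omitted; a "negative" value would show up
	 * here as a very large number and fail the check below.
	 */
	return num_wrs <= free_slots;
}
```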
David Rientjes d3c814e8b2 RDMA/cxgb4: Remove dependency on __GFP_NOFAIL
The alloc_skb() calls in various allocation paths can fail, so remove
__GFP_NOFAIL from their masks.

Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-07-21 10:55:05 -07:00
Steve Wise ba6d39256b RDMA/cxgb4: Add module option to tweak delayed ack
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-07-21 10:53:52 -07:00
Ben Hutchings dccb816de3 IB/ipath: Fix probe failure path
The failure path in ipath_init_one() does not match the cleanup code
in ipath_remove_one() and appears to leave interrupts enabled in some
cases.  Change it to match.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-07-21 10:48:39 -07:00
Joe Perches 78e2c6415a drivers/infiniband: Remove unnecessary casts of private_data
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-07-20 17:23:32 +02:00
Alexander Schmidt 91fb0dd9cb IB/ehca: Fix bitmask handling for lock_hcalls
Fix reading hcall locking capability bit from device capabilities.

Signed-off-by: Alexander Schmidt <alexs@linux.vnet.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-07-19 13:23:32 -07:00
Ralph Campbell cc323b2aaa IB/qib: Avoid variable-length array
Rather than use a variable size array allocation on the stack,
define a constant for the maximum array size possible.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-07-19 13:21:24 -07:00
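The approach in the variable-length array commit above can be sketched as follows: the on-stack array gets a compile-time bound, and requests larger than the bound are rejected. MAX_PORTS and sum_counters() are hypothetical names and values chosen for this sketch.

```c
#include <stddef.h>

#define MAX_PORTS 4	/* hypothetical upper bound, chosen for illustration */

/* Sums up to MAX_PORTS counters using a fixed-size scratch array. */
static long sum_counters(const int *src, size_t n)
{
	int scratch[MAX_PORTS];	/* fixed size instead of "int scratch[n]" */
	long total = 0;
	size_t i;

	if (n > MAX_PORTS)
		return -1;	/* more entries than the bound allows */

	for (i = 0; i < n; i++)
		scratch[i] = src[i];
	for (i = 0; i < n; i++)
		total += scratch[i];
	return total;
}
```

The stack usage is then known at compile time, at the cost of reserving the maximum even for small n.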
Roland Dreier 85963e4cbc RDMA/cxgb4: Remove unneeded NULL check
The rest of the code seems to assume that ep->com.cm_id can't be NULL,
so remove an unneeded test.

Reported-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-07-19 13:13:09 -07:00
Dan Carpenter c1d7356c85 RDMA/cxgb4: Remove unneeded assignment
We don't need to assign rpl here, we do that later on.

Signed-off-by: Dan Carpenter <error27@gmail.com>

[ Indeed this assignment makes no sense, since skb is set to NULL a
  couple of lines before.  - Roland ]

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-07-19 13:09:40 -07:00
Roland Dreier ea9f3bc6d1 RDMA/nes: Rewrite expression to avoid undefined semantics
Change code like

	x = expr(++x)

that assigns to x twice without a sequence point in between to the
intended (and well-defined)

	x = expr(x + 1)

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-07-14 13:29:21 -07:00
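The rewrite described in the commit above can be shown with a wrap-around index. RING_SIZE and next_index() are assumptions for this sketch; the point is only that x is read once and written once in the fixed form.

```c
#define RING_SIZE 8u	/* hypothetical ring size */

/* Advance an index with wrap-around; x is read once and written once. */
static unsigned int next_index(unsigned int x)
{
	/* The undefined form would be: x = (++x) % RING_SIZE; */
	x = (x + 1) % RING_SIZE;
	return x;
}
```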
Ben Hutchings ecd4b48a16 IB/qib: Use request_firmware() to load SD7220 firmware
Extract the microcode for the QLogic QLE7220 series IB HCA and use the
kernel microcode request facility to load the microcode.  This
supports Debian Linux's requirements to separate microcode which
doesn't have open source code available from the device driver.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-07-08 13:27:05 -07:00
Linus Torvalds e467e104bb Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  IPoIB: Fix world-writable child interface control sysfs attributes
  IB/qib: Clean up properly if qib_init() fails
  IB/qib: Completion queue callback needs to be single threaded
  IB/qib: Update 7322 serdes tables
  IB/qib: Clear 6120 hardware error register
  IB/qib: Clear eager buffer memory for each new process
  IB/qib: Mask hardware error during link reset
  IB/qib: Don't mark VL15 bufs as WC to avoid a rare 7322 chip problem
  RDMA/cxgb4: Derive smac_idx from port viid
  RDMA/cxgb4: Avoid false GTS CIDX_INC overflows
  RDMA/cxgb4: Don't call abort_connection() for active connect failures
  RDMA/cxgb4: Use the DMA state API instead of the pci equivalents
2010-07-08 12:20:54 -07:00
Roland Dreier 9e770044a0 Merge branches 'cxgb4', 'ipoib' and 'qib' into for-next 2010-07-08 09:10:24 -07:00
Ralph Campbell 756a33b8dc IB/qib: Clean up properly if qib_init() fails
If qib_init() fails, the driver fails to free memory, unregister
device files, and unregister with the PCIe framework. The driver will
unload without error but a subsequent driver load will cause the
system to panic.  This was found by changing the 7220 code to load the
serdes microcode separately and not installing the microcode file.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-07-06 14:14:04 -07:00
Ralph Campbell 950aff5394 IB/qib: Completion queue callback needs to be single threaded
Workqueues aren't exactly equivalent to tasklets since the callback
function may be called from multiple CPUs before the callback returns.
This causes completion notification callbacks to have MT bugs since
they weren't expecting this behavior. The fix is to use a single
threaded work queue.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-07-06 14:13:58 -07:00
Ralph Campbell 7c7a416ef8 IB/qib: Update 7322 serdes tables
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-07-06 14:13:46 -07:00
Ralph Campbell 2d757a7ce0 IB/qib: Clear 6120 hardware error register
The hardware error register needs to be cleared or another interrupt
will be generated, thus causing an infinite loop.  This is a
regression introduced when removing debug output.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-07-06 14:13:40 -07:00
Ralph Campbell 5df4223a44 IB/qib: Clear eager buffer memory for each new process
The eager buffers are not being cleared before being mmapped into a
new user address space.  This is a potential security risk and should
be fixed.  Note that the eager header queue is already being cleared.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-07-06 14:13:21 -07:00
Ralph Campbell b9e03e0489 IB/qib: Mask hardware error during link reset
The HCA checks for certain hardware errors which can be falsely
triggered when the IB link is reset. The fix is to mask them rather
than report them.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-07-06 14:13:20 -07:00
Dave Olson fce24a9d28 IB/qib: Don't mark VL15 bufs as WC to avoid a rare 7322 chip problem
Don't set write combining via PAT on the VL15 buffers to avoid a rare
problem with unaligned writes from interrupt-flushed store buffers.

Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-07-06 14:13:20 -07:00
Steve Wise 2c5934bfc5 RDMA/cxgb4: Derive smac_idx from port viid
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-07-06 14:05:16 -07:00
Steve Wise 1973e8b8ed RDMA/cxgb4: Avoid false GTS CIDX_INC overflows
The T4 IQ hw design assumes CIDX_INC credits will be returned on a
regular basis and always before the CIDX counter crosses over the PIDX
counter.  For RDMA CQs, however, returning CIDX_INC credits is only
needed and desired when and if the CQ is armed for notification.  This
can lead to a GTS write returning credits that causes the HW to reject
the credit update because it causes CIDX to pass PIDX.  Once this
happens, the CIDX/PIDX counters get out of whack and an application
can miss a notification and get stuck blocked awaiting a notification.

To avoid this, we allocate the HW IQ at 2x the requested size.
This seems to avoid the false overflow failures.  If we see more
issues with this, then we'll have to add code in the poll path to
return credits periodically (for example, when the amount reaches 1/2
the queue depth).  I would like to avoid this as it adds a PCI write
transaction for applications that never arm the CQ (like most MPIs).

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-07-06 14:04:04 -07:00
Steve Wise b21ef16a8b RDMA/cxgb4: Don't call abort_connection() for active connect failures
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-07-06 14:02:54 -07:00
FUJITA Tomonori f38926aa1d RDMA/cxgb4: Use the DMA state API instead of the pci equivalents
This replaces the PCI DMA state API (include/linux/pci-dma.h) with the
DMA equivalents since the PCI DMA state API will become obsolete.

No functional change.

For further information about the background:

http://marc.info/?l=linux-netdev&m=127037540020276&w=2

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-07-06 14:01:42 -07:00
Ben Hutchings 39827be26b IB/{nes, ipoib}: Pass supported flags to ethtool_op_set_flags()
Following commit 1437ce3983 "ethtool:
Change ethtool_op_set_flags to validate flags", ethtool_op_set_flags
takes a third parameter and cannot be used directly as an
implementation of ethtool_ops::set_flags.

Change the nes and ipoib drivers to pass in the appropriate value.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-04 11:48:14 -07:00
Jiri Kosina f1bbbb6912 Merge branch 'master' into for-next 2010-06-16 18:08:13 +02:00
Uwe Kleine-König 421f91d21a fix typos concerning "initiali[zs]e"
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-06-16 18:05:05 +02:00
Uwe Kleine-König 732bee7af3 fix typos concerning "hierarchy"
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-06-16 18:03:14 +02:00
Changli Gao d8d1f30b95 net-next: remove useless union keyword
Remove the useless union keyword in rtable, rt6_info and dn_route.

Since there is only one member in a union, the union keyword isn't useful.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-10 23:31:35 -07:00
Al Viro 971b2e8a3f fix the deadlock in qib_fs
get_sb_single() calls fill_super with superblock locked; calling
deactivate_super() will deadlock immediately.  Moreover, if fill_super
callback returns an error, get_sb_single() will release the reference
to superblock itself just fine.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-06-04 17:16:27 -04:00
Linus Torvalds 3e9345edd8 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  IB/qib: Remove DCA support until feature is finished
  IB/qib: Use a single txselect module parameter for serdes tuning
  IB/qib: Don't rely on (undefined) order of function parameter evaluation
  IB/ucm: Use memdup_user()
  IB/qib: Fix undefined symbol error when CONFIG_PCI_MSI=n
2010-05-30 09:12:16 -07:00
Ralph Campbell 7145c45a06 IB/qib: Remove DCA support until feature is finished
The DCA code was left over from internal development to test the
hardware feature and allow performance testing.  The results were
mixed and will require some additional work to make full use of the
feature.  Therefore, it is being removed for now.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-27 11:04:48 -07:00
Akinobu Mita 1dee31f74f ehca: convert cpu notifier to return encapsulate errno value
After the previous modification, the cpu notifier can return an
encapsulated errno value.  This converts the cpu notifiers for ehca.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Cc: Christoph Raisch <raisch@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27 09:12:48 -07:00
Ralph Campbell a77fcf8950 IB/qib: Use a single txselect module parameter for serdes tuning
As part of the earlier patches submitted and reviewed, it was agreed
to change the way serdes tuning parameters were specified to the
driver.  The updated patch got dropped by the linux-rdma email list so
the earlier version of qib_iba7322.c ended up being used.  This patch
updates qib_iba7322.c to the simpler, single parameter method of
setting the serdes parameters.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-26 16:37:39 -07:00
Roland Dreier f27ec1d6db IB/qib: Don't rely on (undefined) order of function parameter evaluation
Some of the qib sysfs code passes a buffer pointer into 
simple_read_from_buffer() but relies on a function call in another 
parameter of the same call to initialize that pointer.  Since the order
of evaluation of function parameters is undefined, this will break if
gcc chooses the wrong order.

Fix this by splitting the code into two separate function calls.

This was noticed because of warnings like the following on ppc:

    drivers/infiniband/hw/qib/qib_fs.c: In function 'portcntrs_2_read':
    drivers/infiniband/hw/qib/qib_fs.c:203: warning: 'counters' is used uninitialized in this function

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-26 13:15:06 -07:00
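The fix in the commit above, splitting one call into two so that the pointer is set before it is read, can be sketched in user-space C. get_counters() and copy_out() are hypothetical stand-ins (copy_out() loosely plays the role of simple_read_from_buffer()). Note that the C standard calls the order of argument evaluation unspecified; either way the single-call form cannot be relied on.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical producer: points *buf at the data and returns its length. */
static size_t get_counters(const char **buf)
{
	static const char data[] = "counters";

	*buf = data;
	return sizeof(data) - 1;
}

/* Hypothetical consumer: copies at most dst_len bytes out of src. */
static size_t copy_out(char *dst, size_t dst_len, const char *src, size_t avail)
{
	size_t n = avail < dst_len ? avail : dst_len;

	memcpy(dst, src, n);
	return n;
}

static size_t read_counters(char *dst, size_t dst_len)
{
	const char *buf;
	size_t avail;

	/*
	 * Fragile form: copy_out(dst, dst_len, buf, get_counters(&buf));
	 * buf may be read before get_counters() has set it.
	 */
	avail = get_counters(&buf);	/* first call initializes buf */
	return copy_out(dst, dst_len, buf, avail);
}
```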
Ralph Campbell 7e3a1f4ab1 IB/qib: Fix undefined symbol error when CONFIG_PCI_MSI=n
This patch fixes a compile error saying qib_init_iba6120_funcs() is
undefined when CONFIG_PCI_MSI is not defined.  Thanks to Randy Dunlap
<randy.dunlap@oracle.com> for finding this and suggesting the fix.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-25 21:09:43 -07:00
Linus Torvalds 8e9815a0f8 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  RDMA/nes: Fix incorrect unlock in nes_process_mac_intr()
  RDMA/nes: Async event for closed QP causes crash
  RDMA/nes: Have ethtool read hardware registers for rx/tx stats
  RDMA/cxgb4: Only insert sq qid in lookup table
  RDMA/cxgb4: Support IB_WR_READ_WITH_INV opcode
  RDMA/cxgb4: Set fence flag for inv-local-stag work requests
  RDMA/cxgb4: Update some HW limits
  RDMA/cxgb4: Don't limit fastreg page list depth
  RDMA/cxgb4: Return proper errors in fastreg mr/pbl allocation
  RDMA/cxgb4: Fix overflow bug in CQ arm
  RDMA/cxgb4: Optimize CQ overflow detection
  RDMA/cxgb4: CQ size must be IQ size - 2
  RDMA/cxgb4: Register RDMA provider based on LLD state_change events
  RDMA/cxgb4: Detach from the LLD after unregistering RDMA device
  IB/ipath: Remove support for QLogic PCIe QLE devices
  IB/qib: Add new qib driver for QLogic PCIe InfiniBand adapters
  IB/mad: Make needlessly global mad_sendq_size/mad_recvq_size static
  IB/core: Allow device-specific per-port sysfs files
  mlx4_core: Clean up mlx4_alloc_icm() a bit
  mlx4_core: Fix possible chunk sg list overflow in mlx4_alloc_icm()
2010-05-25 12:05:17 -07:00
Roland Dreier acdc30b56a Merge branches 'cxgb4', 'misc', 'mlx4', 'nes' and 'qib' into for-next 2010-05-25 09:54:03 -07:00
Chien Tung b17e0969dc RDMA/nes: Fix incorrect unlock in nes_process_mac_intr()
Commit ce6e74f2 ("RDMA/nes: Make nesadapter->phy_lock usage
consistent") introduced a problem where phy_lock was only unlocked
within an if statement and so nes_process_mac_intr() could return with
phy_lock still held.  Fix this.

This was discovered because of the sparse warning:

    drivers/infiniband/hw/nes/nes_hw.c:2643:9: warning: context imbalance in 'nes_process_mac_intr' - different lock contexts for basic block

Reported-by: Roland Dreier <rdreier@cisco.com>
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-25 09:53:06 -07:00
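The lock imbalance fixed by the commit above comes from releasing a lock on only one branch. A minimal sketch of the balanced form, using a hypothetical counting lock (struct fake_lock) so that the balance can be checked in user space:

```c
/* Hypothetical lock that only counts, so the balance can be checked. */
struct fake_lock {
	int depth;
};

static void fake_lock_acquire(struct fake_lock *l) { l->depth++; }
static void fake_lock_release(struct fake_lock *l) { l->depth--; }

/* Returns 1 if the event was handled, 0 otherwise; the lock is always dropped. */
static int handle_event(struct fake_lock *l, int event_pending)
{
	int handled = 0;

	fake_lock_acquire(l);
	if (event_pending)
		handled = 1;
	/* Release outside the if, so both branches leave the lock balanced. */
	fake_lock_release(l);
	return handled;
}
```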
Faisal Latif df02902313 RDMA/nes: Async event for closed QP causes crash
Under abnormal termination, modify_qp() closes the QP, and async event
(AE) handling also attempts to close the same QP, causing a crash.
Fix this by checking the state of the QP before processing the AE.

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-24 21:12:54 -07:00
Faisal Latif 39942a028c RDMA/nes: Have ethtool read hardware registers for rx/tx stats
Enhance ethtool to read hardware registers for rcv/tx error stats.
Also add support for free pbl resources.  Remove cq depth stats, which
are not used.

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-24 21:12:54 -07:00
Steve Wise 30a6a62fc3 RDMA/cxgb4: Only insert sq qid in lookup table
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-24 21:08:05 -07:00
Steve Wise 2f1fb507ee RDMA/cxgb4: Support IB_WR_READ_WITH_INV opcode
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-24 21:08:04 -07:00
Steve Wise 4ab1eb9c8d RDMA/cxgb4: Set fence flag for inv-local-stag work requests
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-24 21:08:04 -07:00
Steve Wise f64b88433c RDMA/cxgb4: Update some HW limits
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-24 21:08:03 -07:00
Steve Wise 25737bd4ca RDMA/cxgb4: Don't limit fastreg page list depth
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-24 21:08:03 -07:00
Steve Wise 841dba9a5a RDMA/cxgb4: Return proper errors in fastreg mr/pbl allocation
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-24 21:08:02 -07:00
Steve Wise 7ec45b9234 RDMA/cxgb4: Fix overflow bug in CQ arm
- wrap cq->cqidx_inc based on cq size.
- optimize t4_arm_cq logic.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-24 21:08:01 -07:00
Steve Wise 84172dee05 RDMA/cxgb4: Optimize CQ overflow detection
1) save the timestamp flit in the cq when we consume a CQE.

2) always compare the saved flit with the previous entry flit when
   reading the next CQE entry.  If the flits don't match, then we
   have overflowed.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-24 21:08:01 -07:00
Steve Wise 895cf5f3d6 RDMA/cxgb4: CQ size must be IQ size - 2
We need 1 extra entry for the status page and 1 to always have 1 free
entry to detect when the queue is full.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-24 21:08:00 -07:00
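The "IQ size - 2" arithmetic in the commit above can be illustrated with a generic producer/consumer ring: one slot is set aside for the status page, and one more always stays free so that a full ring can be told apart from an empty one. The sizes and names below are hypothetical and this is a generic ring sketch, not the T4 queue layout.

```c
#include <stdbool.h>

#define HW_ENTRIES 8u					/* hypothetical IQ size */
#define STATUS_ENTRIES 1u				/* slot reserved for the status page */
#define RING_ENTRIES (HW_ENTRIES - STATUS_ENTRIES)	/* slots the indices walk over */

struct ring {
	unsigned int pidx;	/* producer index */
	unsigned int cidx;	/* consumer index */
};

static bool ring_empty(const struct ring *r)
{
	return r->pidx == r->cidx;
}

static bool ring_full(const struct ring *r)
{
	/* Full when advancing pidx would make it equal cidx. */
	return (r->pidx + 1) % RING_ENTRIES == r->cidx;
}

static bool ring_push(struct ring *r)
{
	if (ring_full(r))
		return false;
	r->pidx = (r->pidx + 1) % RING_ENTRIES;
	return true;
}
```

Without the always-free slot, pidx == cidx would be ambiguous between an empty and a completely full ring.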
Steve Wise 1c01c53883 RDMA/cxgb4: Register RDMA provider based on LLD state_change events
The LLD now supports proper UP state change events, so move the RDMA
provider registration to UP path.

This fixes a crash when loading iw_cxgb4 _after_ the NFS/RDMA
transport is up and running.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-24 21:07:59 -07:00
Steve Wise fd388ce677 RDMA/cxgb4: Detach from the LLD after unregistering RDMA device
In the RDMA core unregister path, kernel users will be calling down
into the T4 provider to release resources.  So we cannot detach from
the LLD until this process completes.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-24 21:07:59 -07:00
Ralph Campbell f6d60848ba IB/ipath: Remove support for QLogic PCIe QLE devices
The ib_qib driver is taking over support for QLogic PCIe QLE devices,
so remove support for them from ib_ipath.  The ib_ipath driver now
supports only the obsolete QLogic Hyper-Transport IB host channel
adapter (model QHT7140).

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-23 22:14:25 -07:00
Ralph Campbell f931551baf IB/qib: Add new qib driver for QLogic PCIe InfiniBand adapters
Add a low-level IB driver for QLogic PCIe adapters.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-23 21:44:54 -07:00
Grant Likely cf9b59e9d3 Merge remote branch 'origin' into secretlab/next-devicetree
Merging in current state of Linus' tree to deal with merge conflicts and
build failures in vio.c after merge.

Conflicts:
	drivers/i2c/busses/i2c-cpm.c
	drivers/i2c/busses/i2c-mpc.c
	drivers/net/gianfar.c

Also fixed up one line in arch/powerpc/kernel/vio.c to use the
correct node pointer.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-05-22 00:36:56 -06:00
Grant Likely 4018294b53 of: Remove duplicate fields from of_platform_driver
.name, .match_table and .owner are duplicated in both of_platform_driver
and device_driver.  This patch removes the extra copies from struct
of_platform_driver and converts all users to the device_driver members.

This patch is a pretty mechanical change.  The usage model doesn't change
and if any drivers have been missed, or if anything has been fixed up
incorrectly, then it will fail with a compile time error, and the fixup
will be trivial.  This patch looks big and scary because it touches so
many files, but it should be pretty safe.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Sean MacLennan <smaclennan@pikatech.com>
2010-05-22 00:10:40 -06:00
Ralph Campbell 9a6edb60ec IB/core: Allow device-specific per-port sysfs files
Add a new parameter to ib_register_device() so that low-level device
drivers can pass in a pointer to a callback function that will be
called for each port that is registered in sysfs.  This allows
low-level device drivers to create files in

    /sys/class/infiniband/<hca>/ports/<N>/

without having to poke through the internals of the RDMA sysfs handling.

There is no need for an unregister function since the kobject
reference will go to zero when ib_unregister_device() is called.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-21 10:34:44 -07:00
Linus Torvalds f8965467f3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1674 commits)
  qlcnic: adding co maintainer
  ixgbe: add support for active DA cables
  ixgbe: dcb, do not tag tc_prio_control frames
  ixgbe: fix ixgbe_tx_is_paused logic
  ixgbe: always enable vlan strip/insert when DCB is enabled
  ixgbe: remove some redundant code in setting FCoE FIP filter
  ixgbe: fix wrong offset to fc_frame_header in ixgbe_fcoe_ddp
  ixgbe: fix header len when unsplit packet overflows to data buffer
  ipv6: Never schedule DAD timer on dead address
  ipv6: Use POSTDAD state
  ipv6: Use state_lock to protect ifa state
  ipv6: Replace inet6_ifaddr->dead with state
  cxgb4: notify upper drivers if the device is already up when they load
  cxgb4: keep interrupts available when the ports are brought down
  cxgb4: fix initial addition of MAC address
  cnic: Return SPQ credit to bnx2x after ring setup and shutdown.
  cnic: Convert cnic_local_flags to atomic ops.
  can: Fix SJA1000 command register writes on SMP systems
  bridge: fix build for CONFIG_SYSFS disabled
  ARCNET: Limit com20020 PCI ID matches for SOHARD cards
  ...

Fix up various conflicts with pcmcia tree drivers/net/
{pcmcia/3c589_cs.c, wireless/orinoco/orinoco_cs.c and
wireless/orinoco/spectrum_cs.c} and feature removal
(Documentation/feature-removal-schedule.txt).

Also fix a non-content conflict due to pm_qos_requirement getting
renamed in the PM tree (now pm_qos_request) in net/mac80211/scan.c
2010-05-20 21:04:44 -07:00
Linus Torvalds f39d01be4c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (44 commits)
  vlynq: make whole Kconfig-menu dependant on architecture
  add descriptive comment for TIF_MEMDIE task flag declaration.
  EEPROM: max6875: Header file cleanup
  EEPROM: 93cx6: Header file cleanup
  EEPROM: Header file cleanup
  agp: use NULL instead of 0 when pointer is needed
  rtc-v3020: make bitfield unsigned
  PCI: make bitfield unsigned
  jbd2: use NULL instead of 0 when pointer is needed
  cciss: fix shadows sparse warning
  doc: inode uses a mutex instead of a semaphore.
  uml: i386: Avoid redefinition of NR_syscalls
  fix "seperate" typos in comments
  cocbalt_lcdfb: correct sections
  doc: Change urls for sparse
  Powerpc: wii: Fix typo in comment
  i2o: cleanup some exit paths
  Documentation/: it's -> its where appropriate
  UML: Fix compiler warning due to missing task_struct declaration
  UML: add kernel.h include to signal.c
  ...
2010-05-20 09:20:59 -07:00
Grant Likely 61c7a080a5 of: Always use 'struct device.of_node' to get device node pointer.
The following structure elements duplicate the information in
'struct device.of_node' and so are being eliminated.  This patch
makes all readers of these elements use device.of_node instead.

(struct of_device *)->node
(struct dev_archdata *)->prom_node (sparc)
(struct dev_archdata *)->of_node (powerpc & microblaze)

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-05-18 16:10:44 -06:00
Roland Dreier ffebedb7ab Merge branches 'amso1100', 'bkl', 'cma', 'cxgb3', 'cxgb4', 'ipoib', 'iser', 'masked-atomics', 'misc', 'mthca' and 'nes' into for-next 2010-05-15 20:06:01 -07:00
Roland Dreier be4c9bad9d MAINTAINERS: Add cxgb4 and iw_cxgb4 entries
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-05 14:45:40 -07:00
Roland Dreier 617c9a7e39 RDMA/cxgb3: Shrink .text with compile-time init of handlers arrays
Using compile-time designated initializers for the handler arrays
instead of open-coding the initialization in iwch_cm_init() is (IMHO)
cleaner, and leads to substantially smaller code: on my x86-64 build,
bloat-o-meter shows:

add/remove: 0/1 grow/shrink: 4/3 up/down: 4/-1682 (-1678)
function                                     old     new   delta
tx_ack                                       167     168      +1
state_set                                     55      56      +1
start_ep_timer                                99     100      +1
pass_establish                               177     178      +1
act_open_req_arp_failure                      39      38      -1
sched                                         84      82      -2
iwch_cm_init                                 442      91    -351
work_handlers                               1328       -   -1328

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-04-28 14:57:40 -07:00
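As a rough illustration of the change described in 617c9a7e39, the sketch below builds a handler table with C designated initializers instead of filling it in from an init function. The opcode names, handler functions and dispatch() are hypothetical and are not taken from iwch_cm.c; this is a userspace sketch, not the driver code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical message opcodes; the real driver indexes by its own opcodes. */
enum msg_op { MSG_OPEN, MSG_DATA, MSG_CLOSE, NUM_MSG_OPS };

typedef int (*msg_handler)(int arg);

static int handle_open(int arg)  { return arg + 1; }
static int handle_close(int arg) { return arg - 1; }

/*
 * Compile-time initialization: the table is laid out by the compiler,
 * so no init function has to assign each slot at load time.  Slots
 * that are not named (MSG_DATA here) are implicitly NULL.
 */
static const msg_handler handlers[NUM_MSG_OPS] = {
	[MSG_OPEN]  = handle_open,
	[MSG_CLOSE] = handle_close,
};

/* Dispatch one message; returns -1 if no handler is registered. */
int dispatch(enum msg_op op, int arg)
{
	assert((int)op >= 0 && (int)op < NUM_MSG_OPS);
	if (handlers[op] == NULL)
		return -1;
	return handlers[op](arg);
}
```

Because the table is const and fully known at build time, the open-coded assignments disappear from the init path, which is consistent with the iwch_cm_init shrink reported by bloat-o-meter above.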
Jiri Kosina 6c9468e9eb Merge branch 'master' into for-next 2010-04-23 02:08:44 +02:00
Vladimir Sokolovsky 6fa8f71984 IB/mlx4: Add support for masked atomic operations
Add support for masked atomic operations (masked compare and swap,
masked fetch and add).

Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-04-21 16:37:49 -07:00
Roland Dreier 53978b46cd RDMA/nes: Make unnecessarily global functions static
This allows the compiler to do a bit better; on my x86-64 build:

add/remove: 0/2 grow/shrink: 1/0 up/down: 2288/-2365 (-77)
function                                     old     new   delta
nes_init_phy                                 273    2561   +2288
nes_init_1g_phy                              469       -    -469
nes_init_2025_phy                           1896       -   -1896

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-04-21 15:58:28 -07:00
Chien Tung ce6e74f23d RDMA/nes: Make nesadapter->phy_lock usage consistent
nes_{read,write}_1G_phy_reg() are using phy_lock while
nes_{read,write}_10G_phy_reg() leave that to the caller.

Remove phy_lock from 1G routines and leave the locking to the caller.
Add additional phy_lock calls around 1G read/write.

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-04-21 15:46:40 -07:00
Steve Wise cfdda9d764 RDMA/cxgb4: Add driver for Chelsio T4 RNIC
Add an RDMA/iWARP driver for Chelsio T4 Ethernet adapters.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-04-21 15:30:06 -07:00
FUJITA Tomonori 3a2baff783 IB/mthca: Use the dma state API instead of pci equivalents
The DMA API is preferred; no functional change.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-04-21 15:25:34 -07:00
FUJITA Tomonori e749444057 RDMA/amso1100: Use the dma state API instead of pci equivalents
The DMA API is preferred; no functional change.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-04-21 15:23:10 -07:00
Steve Wise 73a203d201 RDMA/cxgb3: Don't free skbs on NET_XMIT_* indications from LLD
The low level cxgb3 driver can return NET_XMIT_CN and friends.
The iw_cxgb3 driver should _not_ treat these as errors.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-04-21 15:21:28 -07:00
FUJITA Tomonori 7960d6b9de RDMA/cxgb3: Use the dma state API instead of pci equivalents
The DMA API is preferred; no functional change.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-04-21 15:17:38 -07:00
David S. Miller 871039f02f Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/stmmac/stmmac_main.c
	drivers/net/wireless/wl12xx/wl1271_cmd.c
	drivers/net/wireless/wl12xx/wl1271_main.c
	drivers/net/wireless/wl12xx/wl1271_spi.c
	net/core/ethtool.c
	net/mac80211/scan.c
2010-04-11 14:53:53 -07:00
Linus Torvalds 0eddb519b9 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  IB/mlx4: Check correct variable for allocation failure
  RDMA/nes: Correct cap.max_inline_data assignment in nes_query_qp()
  RDMA/cm: Set num_paths when manually assigning path records
  IB/cm: Fix device_create() return value check
2010-04-09 11:53:06 -07:00
Roland Dreier 5091b35388 Merge branches 'cma', 'misc', 'mlx4' and 'nes' into for-linus 2010-04-09 09:14:21 -07:00
Dan Carpenter 7bd912998e IB/mlx4: Check correct variable for allocation failure
The intent here is to check the "mfrpl->mapped_page_list" allocation.
We checked "mfrpl->ibfrpl.page_list" earlier.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-04-07 14:18:14 -07:00
Chien Tung eadde3a1a5 RDMA/nes: Correct cap.max_inline_data assignment in nes_query_qp()
cap.max_inline_data is incorrectly set in init_attr instead of attr.
Set it in attr so subsequent init_attr.cap assignment will get the
correct value.

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-04-07 14:16:08 -07:00
Jiri Pirko 22bedad3ce net: convert multicast list to list_head
Converts the list and the core code that manipulates it to be the same as uc_list.

+uses two functions for adding/removing mc address (normal and "global"
 variant) instead of a function parameter.
+removes dev_mcast.c completely.
+exposes netdev_hw_addr_list_* macros along with __hw_addr_* functions for
 manipulating lists in a sandbox (used in bonding and 80211 drivers)

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-03 14:22:15 -07:00
Tejun Heo 5a0e3ad6af include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities to include those
headers directly instead of assuming availability.  As this conversion
needs to touch a large number of source files, the following script is
used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the following.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and tries to put the new include such that its order conforms
  to its surrounding.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have a fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   widely available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build tests were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-30 22:02:32 +09:00
Thomas Weber 8839316121 Fix typos in comments
[Ss]ytem => [Ss]ystem
udpate => update
paramters => parameters
orginal => original

Signed-off-by: Thomas Weber <swirl@gmx.li>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-03-16 11:47:56 +01:00
Linus Torvalds 122ce878dc Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  RDMA/nes: Fix CX4 link problem in back-to-back configuration
  RDMA/nes: Clear stall bit before destroying NIC QP
  RDMA/nes: Set assume_aligned_header bit
  RDMA/cxgb3: Wait at least one schedule cycle during device removal
  IB/mad: Ignore iWARP devices on device removal
  IPoIB: Include return code in trace message for ib_post_send() failures
  IPoIB: Fix TX queue lockup with mixed UD/CM traffic
2010-03-13 14:38:31 -08:00
Roland Dreier 0636b33c5f Merge branches 'cxgb3', 'ipoib', 'misc' and 'nes' into for-next 2010-03-12 10:54:20 -08:00
Chien Tung a72042c08a RDMA/nes: Fix CX4 link problem in back-to-back configuration
Commit 09124e19 ("RDMA/nes: Add support for KR device id 0x0110") took
out too much code and broke CX4 link detection in back-to-back
configuration.  Put back the code that does the link check.

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-03-12 10:54:11 -08:00
Chien Tung 9f29006ae8 RDMA/nes: Clear stall bit before destroying NIC QP
Clear the stall bit to drop any incoming packets while destroying NIC
QP.  This will prevent a chip resource leak.

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-03-11 15:12:15 -08:00
Faisal Latif 883c699241 RDMA/nes: Set assume_aligned_header bit
Set assume_aligned_header bit in QP context as requested by hardware group.

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-03-11 15:11:12 -08:00
Steve Wise 69960a275e RDMA/cxgb3: Wait at least one schedule cycle during device removal
During a hot-plug LLD removal event or an EEH error event, iw_cxgb3
must ensure that any/all threads that might be in a cxgb3 exported
function must return from the function before iw_cxgb3 returns from
its event processing.  Do this by calling synchronize_net().

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-03-11 14:00:35 -08:00
Jiri Kosina 318ae2edc3 Merge branch 'for-next' into for-linus
Conflicts:
	Documentation/filesystems/proc.txt
	arch/arm/mach-u300/include/mach/debug-macro.S
	drivers/net/qlge/qlge_ethtool.c
	drivers/net/qlge/qlge_main.c
	drivers/net/typhoon.c
2010-03-08 16:55:37 +01:00
Linus Torvalds 3ff1562ea4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (48 commits)
  IB/srp: Clean up error path in srp_create_target_ib()
  IB/srp: Split send and recieve CQs to reduce number of interrupts
  RDMA/nes: Add support for KR device id 0x0110
  IB/uverbs: Use anon_inodes instead of private infinibandeventfs
  IB/core: Fix and clean up ib_ud_header_init()
  RDMA/cxgb3: Mark RDMA device with CXIO_ERROR_FATAL when removing
  RDMA/cxgb3: Don't allocate the SW queue for user mode CQs
  RDMA/cxgb3: Increase the max CQ depth
  RDMA/cxgb3: Doorbell overflow avoidance and recovery
  IB/core: Pack struct ib_device a little tighter
  IB/ucm: Clean whitespace errors
  IB/ucm: Increase maximum devices supported
  IB/ucm: Use stack variable 'base' in ib_ucm_add_one
  IB/ucm: Use stack variable 'devnum' in ib_ucm_add_one
  IB/umad: Clean whitespace
  IB/umad: Increase maximum devices supported
  IB/umad: Use stack variable 'base' in ib_umad_init_port
  IB/umad: Use stack variable 'devnum' in ib_umad_init_port
  IB/umad: Remove port_table[]
  IB/umad: Convert *cdev to cdev in struct ib_umad_port
  ...
2010-03-03 07:33:17 -08:00
Roland Dreier fe8875e5a4 Merge branch 'misc' into for-next
Conflicts:
	drivers/infiniband/core/uverbs_main.c
2010-03-01 23:52:31 -08:00
Roland Dreier 3bbddbada8 Merge branch 'nes' into for-next 2010-03-01 23:51:57 -08:00
Roland Dreier a835fb3095 Merge branch 'mlx4' into for-next 2010-03-01 23:51:56 -08:00
Roland Dreier 85f938a70c Merge branch 'ehca' into for-next 2010-03-01 23:51:55 -08:00
Jiri Pirko fbf219f1c8 infiniband: convert to use netdev_for_each_mc_addr
Due to the loop complexity in nes_nic.c, I'm using char* to copy mc addresses
to it.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-02-26 04:22:27 -08:00
Chien Tung 09124e1913 RDMA/nes: Add support for KR device id 0x0110
Add support for KR device id 0x0110.  While at it, cleanup
nes_init_phy() by splitting it into nes_init_1g_phy() and
nes_init_2025_phy().

Remove support for NES_PHY_TYPE_IRIS, which was used on an XFP board
that was only manufactured in small quantities and given out for evals
in even smaller quantities.

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-25 10:40:05 -08:00
Eli Cohen 920d706c89 IB/core: Fix and clean up ib_ud_header_init()
ib_ud_header_init() first clears header and then fills up the various
fields.  Later on, it tests header->immediate_present, which it has
already cleared, so the condition is always false.  Fix this by adding
an immediate_present parameter and setting header->immediate_present
as is done with grh_present.  Also remove unused calculation of
header_len.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24 14:54:10 -08:00
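The bug fixed by 920d706c89 has a shape that is easy to reproduce outside the kernel: a field is tested after the structure holding it has been zeroed. The sketch below uses a hypothetical, heavily simplified struct ud_header and made-up opcode values; it is not the real struct ib_ud_header or the real function signature.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-in for the UD header. */
struct ud_header {
	int grh_present;
	int immediate_present;
	int opcode;
};

enum { OP_SEND_ONLY = 1, OP_SEND_ONLY_WITH_IMMEDIATE = 2 };

/*
 * Buggy shape: the flag is read after memset() has cleared it, so the
 * "with immediate" branch can never be taken.
 */
void ud_header_init_buggy(struct ud_header *hdr, int grh_present)
{
	memset(hdr, 0, sizeof(*hdr));
	hdr->grh_present = grh_present;
	hdr->opcode = hdr->immediate_present ?
		OP_SEND_ONLY_WITH_IMMEDIATE : OP_SEND_ONLY;
}

/* Fixed shape: the caller passes the flag in, as with grh_present. */
void ud_header_init_fixed(struct ud_header *hdr, int grh_present,
			  int immediate_present)
{
	assert(hdr != NULL);
	memset(hdr, 0, sizeof(*hdr));
	hdr->grh_present = grh_present;
	hdr->immediate_present = immediate_present;
	hdr->opcode = immediate_present ?
		OP_SEND_ONLY_WITH_IMMEDIATE : OP_SEND_ONLY;
}
```

Setting immediate_present before calling the buggy variant makes no difference, because the memset() discards it; only the explicit parameter survives the clear.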
Steve Wise 68baf495d8 RDMA/cxgb3: Mark RDMA device with CXIO_ERROR_FATAL when removing
If cxgb3 calls the iw_cxgb3 t3cclient remove function due to a device
removal event, then the iwch device must be marked with CXIO_ERROR_FATAL
since the device below us is going away.  Otherwise, we can get stuck in
a deadlock as RDMA ULPs try and deallocate objects (like MRs, QPs, etc).
So always mark the device with CXIO_ERROR_FATAL when removing.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24 10:40:30 -08:00
Steve Wise 5279d3ac2d RDMA/cxgb3: Don't allocate the SW queue for user mode CQs
Only kernel mode CQs need the SW queue memory allocated.  The SW queue
for user mode CQs is allocated in userspace by libcxgb3.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24 10:40:29 -08:00
Steve Wise 9918b28d2b RDMA/cxgb3: Increase the max CQ depth
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24 10:40:29 -08:00
Steve Wise e998f245c4 RDMA/cxgb3: Doorbell overflow avoidance and recovery
T3 hardware doorbell FIFO overflows can cause application stalls due
to lost doorbell ring events.  This has been seen when running large
NP IMB alltoall MPI jobs.  The T3 hardware supports an xon/xoff-type
flow control mechanism to help avoid overflowing the HW doorbell FIFO.

This patch uses these interrupts to disable RDMA QP doorbell rings
when we near an overflow condition, and then turn them back on (and
ring all the active QP doorbells) when the doorbell FIFO empties
out.  In addition if a doorbell ring is dropped by the hardware, the
code will now recover.

Design:

cxgb3:
- enable these DB interrupts
- in the interrupt handler, schedule work tasks to call the ULPs event
  handlers with the new events.
- ring all the qset txqs when an overflow is detected.

iw_cxgb3:
- disable db ringing on all active qps when we get the DB_FULL event
- enable db ringing on all active qps and ring all active dbs when we get
  the DB_EMPTY event
- On DB_DROP event:
       - disable db rings in the event handler
       - delay-schedule a work task which rings and enables the dbs on
         all active qps.
- in post_send and post_recv logic, don't ring the db if it's disabled.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24 10:40:28 -08:00
Or Gerlitz 831d06cf5b RDMA/nes: Change WQ overflow return code
Change the nes driver to return -ENOMEM on SQ/RQ overflow to match the
return code of other RDMA HW drivers (e.g., cxgb3, ehca, mlx4, mthca).

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Acked-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-19 13:51:46 -08:00
Faisal Latif 30b172ff8e RDMA/nes: Multiple disconnects cause crash during AE handling
There is a double disconnect during AE processing, causing crashes.
While fixing the crash, also simplify the AE handling code.

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-19 11:38:33 -08:00
Faisal Latif 43093b9412 RDMA/nes: Fix crash when listener destroyed during loopback setup
When a listener is destroyed and there is an MPA response pending for
loopback connection, the active side cm_node gets destroyed twice:
once in cm_event_connect_error() and again in nes_accept()/nes_reject().

Increment the cm_node's refcount so it's not destroyed by
cm_event_connect_error().

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-19 11:38:27 -08:00
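The fix in 43093b9412 relies on ordinary reference counting: an extra reference keeps the first put from destroying the node, so only the last put does. A minimal sketch of that idea follows, with a hypothetical struct node_sketch and helper names that are not the nes driver's; it is single-threaded and ignores the locking the driver needs.

```c
#include <assert.h>

/* Hypothetical node; "destroyed" counts how often teardown ran. */
struct node_sketch {
	int refcount;
	int destroyed;
};

/* Take an extra reference on a node that is still alive. */
void node_get(struct node_sketch *n)
{
	assert(n->refcount > 0);
	n->refcount++;
}

/* Drop a reference; returns 1 if this put destroyed the node. */
int node_put(struct node_sketch *n)
{
	assert(n->refcount > 0);
	if (--n->refcount == 0) {
		n->destroyed++;		/* teardown runs exactly once */
		return 1;
	}
	return 0;
}
```

With the extra node_get(), the put on the error path only drops the count, and the later put on the accept/reject path performs the single teardown.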
Faisal Latif 6e10d2e407 RDMA/nes: Use atomic counters for CM listener create and destroy
After running long iterative MPI tests, sometimes ethtool reports a
"CM Destroy Listener" count more than the "CM Create Listener" count.
This inconsistency is fixed by making counter variables atomic.

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-19 11:38:14 -08:00
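The counter mismatch described in 6e10d2e407 is what plain integer increments can produce when several CPUs update them at once. A userspace approximation using C11 atomics is sketched below; the kernel uses its own atomic_t API, and the counter and function names here are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical counters; the driver's own variable names may differ. */
static atomic_int listen_created;
static atomic_int listen_destroyed;

void note_listen_created(void)
{
	/* Atomic read-modify-write: concurrent callers cannot lose updates. */
	atomic_fetch_add(&listen_created, 1);
}

void note_listen_destroyed(void)
{
	int prev = atomic_fetch_add(&listen_destroyed, 1);

	/* A listener is only destroyed after it was created. */
	assert(prev < atomic_load(&listen_created));
	(void)prev;
}

/* Number of listeners created but not yet destroyed. */
int listeners_alive(void)
{
	return atomic_load(&listen_created) - atomic_load(&listen_destroyed);
}
```

With non-atomic counters, two concurrent increments can collapse into one, which is one way a "destroy" count could end up larger than the "create" count.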
Alexander Schmidt 45e354e3f2 IB/ehca: Require in_wc in process_mad()
If the caller does not pass a valid in_wc to process_mad(), return MAD
failure status, as it is not possible to generate a valid MAD redirect
response (and redirects are the only MAD responses ehca generates).

Signed-off-by: Alexander Schmidt <alexs@linux.vnet.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-19 11:13:39 -08:00
David S. Miller 2bb4646fce Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2010-02-16 22:09:29 -08:00
Alexander Schmidt fa55e30bc3 IB/ehca: Allow access for ib_query_qp()
The max_dest_rd_atomic and max_qp_rd_atomic values are properly
returned by query_qp(), so there should not be an error returned when
they are queried.

Signed-off-by: Alexander Schmidt <alexs@linux.vnet.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-12 15:25:06 -08:00
Alexander Schmidt 25ef756385 IB/ehca: Do not turn off irqs in tasklet context
The irq_spinlock is only taken in tasklet context, so it is safe not to
disable hardware interrupts.

Signed-off-by: Alexander Schmidt <alexs@linux.vnet.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-12 15:22:37 -08:00
Eli Cohen a478868a1b IB/mlx4: Simplify retrieval of ib_device
struct ib_qp already holds a pointer to the ib device. No need to dive into the
hw device object to retrieve it.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-12 15:18:06 -08:00
Jiri Pirko 4cd24eaf0c net: use netdev_mc_count and netdev_mc_empty when appropriate
This patch replaces dev->mc_count in all drivers (hopefully I didn't miss
anything). Used spatch and did small tweaks and coding style changes when
it was suitable.

Jirka

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-02-12 11:38:58 -08:00
Jiri Slaby ccbe9f0b11 RDMA: Use rlimit helpers
Make sure compiler won't do weird things with limits by using the
rlimit helpers added in 3e10e716 ("resource: add helpers for fetching
rlimits").  E.g. fetching them twice may return 2 different values
after writable limits are implemented.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-11 15:40:48 -08:00
Steve Wise 2542322485 RDMA/cxgb3: Remove BUG_ON() on CQ rearm failure
Failure to rearm a CQ means the cxgb3 device is wedged, but we shouldn't
kill the whole system with a BUG_ON() if this happens.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-11 15:40:29 -08:00
Daniel Mack 3ad2f3fbb9 tree-wide: Assorted spelling fixes
In particular, several occurrences of funny versions of 'success',
'unknown', 'therefore', 'acknowledge', 'argument', 'achieve', 'address',
'beginning', 'desirable', 'separate' and 'necessary' are fixed.

Signed-off-by: Daniel Mack <daniel@caiaq.de>
Cc: Joe Perches <joe@perches.com>
Cc: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-02-09 11:13:56 +01:00
Al Viro 12e9a45609 Fix failure exit in ipathfs
deactivate_locked_super() will be done by caller of fill_super, doing
it there as well is b0rken.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-01-26 22:22:27 -05:00
David S. Miller 51c24aaaca Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2010-01-23 00:31:06 -08:00
H Hartley Sweeten eacc4d6a7d drivers/infiniband/hw/cxgb3/iwch_cm.c: use %pM to show MAC address
Use the %pM kernel extension to display the MAC address.

The only difference in the output is that the MAC address is
shown in the usual colon-separated hex notation.

Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-01-07 01:17:27 -08:00
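For readers unfamiliar with the %pM extension mentioned in eacc4d6a7d, the following userspace sketch approximates its output for a 6-byte MAC address with standard snprintf(). format_mac() is a hypothetical helper written for this note; %pM itself is only available in the kernel's printk family.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Userspace approximation of the colon-separated hex notation that
 * %pM produces.  buf must hold at least 18 bytes (17 characters plus
 * the terminating NUL).  Returns the number of characters written.
 */
int format_mac(char *buf, size_t len, const unsigned char mac[6])
{
	assert(buf != NULL && len >= 18);
	return snprintf(buf, len, "%02x:%02x:%02x:%02x:%02x:%02x",
			mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
}
```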
Or Gerlitz 2b94607742 IB/mlx4: Fix queue overflow check in post_recv
In mlx4_ib_post_recv(), we should check the queue for overflow using
recv_cq instead of send_cq (current code looks like a copy-and-paste
mistake).

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-01-06 12:51:30 -08:00
Jack Morgenstein 4c425588e0 IB/mlx4: Initialize SRQ scatter entries when creating an SRQ
As for memfree mthca hardware, ConnectX also requires SRQ WQE scatter
entries to be initialized with the invalid L_Key at SRQ creation time.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-01-06 12:48:55 -08:00
Stefani Seibold 9842c38e91 kfifo: fix warn_unused_result
Fix the "ignoring return value of '...', declared with attribute
warn_unused_result" compiler warning in several users of the new kfifo
API.

It removes the __must_check attribute from kfifo_in() and
kfifo_in_locked(), whose return values do not necessarily need to be checked.

Fix the allocation bug in the nozomi driver file, by moving out the
kfifo_alloc from the interrupt handler into the probe function.

Fix the kfifo_out() and kfifo_out_locked() users to handle an unexpected
end of fifo.

Signed-off-by: Stefani Seibold <stefani@seibold.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-22 14:17:56 -08:00
Stefani Seibold 7acd72eb85 kfifo: rename kfifo_put... into kfifo_in... and kfifo_get... into kfifo_out...
rename kfifo_put...  into kfifo_in...  to prevent misuse by old drivers
that are not in the kernel tree

ditto for kfifo_get...  -> kfifo_out...

Improve the prototypes of kfifo_in and kfifo_out to make the kerneldoc
annotations more readable.

Add mini "howto porting to the new API" in kfifo.h

Signed-off-by: Stefani Seibold <stefani@seibold.net>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-22 14:17:56 -08:00
Stefani Seibold e64c026dd0 kfifo: cleanup namespace
change name of __kfifo_* functions to kfifo_*, because the prefix __kfifo
should be reserved for internal functions only.

Signed-off-by: Stefani Seibold <stefani@seibold.net>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-22 14:17:56 -08:00
Stefani Seibold c1e13f2567 kfifo: move out spinlock
Move the pointer to the spinlock out of struct kfifo.  Most users in
tree do not actually use a spinlock, so the few exceptions now have to
call kfifo_{get,put}_locked, which takes an extra argument to a
spinlock.

Signed-off-by: Stefani Seibold <stefani@seibold.net>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-22 14:17:56 -08:00
Stefani Seibold 4546548789 kfifo: move struct kfifo in place
This is a new generic kernel FIFO implementation.

The current kernel fifo API is not very widely used, because it has too
many constraints.  Only 17 files in the current 2.6.31-rc5 used it.
FIFOs are, like lists, a very basic thing, and a kfifo API which handles
most use cases would save a lot of development time and memory
resources.

I think these are the reasons why kfifo is not in use:

 - The API is too simple, important functions are missing
 - A fifo can be only allocated dynamically
 - There is a requirement of a spinlock whether you need it or not
 - There is no support for data records inside a fifo

So I decided to extend the kfifo in a more generic way without blowing up
the API too much.  The new API has the following benefits:

 - Generic usage: For kernel internal use and/or device driver.
 - Provide an API for most use cases.
 - Slim API: The whole API provides 25 functions.
 - Linux style habit.
 - DECLARE_KFIFO, DEFINE_KFIFO and INIT_KFIFO Macros
 - Direct copy_to_user from the fifo and copy_from_user into the fifo.
 - The kfifo itself is an in-place member of the using data structure; this saves an
   indirection access and does not waste the kernel allocator.
 - Lockless access: if only one reader and one writer is active on the fifo,
   which is the common use case, no additional locking is necessary.
 - Remove spinlock - give the user the freedom of choice what kind of locking to use if
   one is required.
 - Ability to handle records. Three types of records are supported:
   - Variable length records between 0-255 bytes, with a record size
     field of 1 byte.
   - Variable length records between 0-65535 bytes, with a record size
     field of 2 bytes.
   - Fixed size records, with no record size field.
 - Preserve memory resource.
 - Performance!
 - Easy to use!

This patch:

Since most users want to have the kfifo as part of another object,
reorganize the code to allow including struct kfifo in another data
structure.  This requires changing the kfifo_alloc and kfifo_init
prototypes so that we pass an existing kfifo pointer into them.  This
patch changes the implementation and all existing users.

[akpm@linux-foundation.org: fix warning]
Signed-off-by: Stefani Seibold <stefani@seibold.net>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-22 14:17:55 -08:00
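Two points from 4546548789, the FIFO embedded by value in its owner and the separate reader and writer indices, can be sketched in plain C as below. This is not the kfifo API: struct ring, ring_in(), ring_out() and struct device_state are hypothetical names, and the sketch omits the memory barriers a real lockless single-reader, single-writer FIFO needs, so it is only valid single-threaded.

```c
#include <assert.h>
#include <stddef.h>

#define RING_SIZE 8u	/* must be a power of two */

/* Hypothetical byte FIFO, meant to be embedded by value in its owner. */
struct ring {
	unsigned int in;	/* only advanced by the writer */
	unsigned int out;	/* only advanced by the reader */
	unsigned char buf[RING_SIZE];
};

struct device_state {
	int id;
	struct ring rx;		/* in place: no pointer, no separate allocation */
};

/* Number of bytes stored; unsigned wrap-around keeps this correct. */
unsigned int ring_len(const struct ring *r)
{
	return r->in - r->out;
}

/* Copy up to len bytes in; returns the number actually stored. */
unsigned int ring_in(struct ring *r, const unsigned char *src, unsigned int len)
{
	unsigned int space, i;

	assert(ring_len(r) <= RING_SIZE);
	space = RING_SIZE - ring_len(r);
	if (len > space)
		len = space;
	for (i = 0; i < len; i++)
		r->buf[(r->in + i) & (RING_SIZE - 1)] = src[i];
	r->in += len;
	return len;
}

/* Copy up to len bytes out; returns the number actually read. */
unsigned int ring_out(struct ring *r, unsigned char *dst, unsigned int len)
{
	unsigned int avail = ring_len(r);
	unsigned int i;

	if (len > avail)
		len = avail;
	for (i = 0; i < len; i++)
		dst[i] = r->buf[(r->out + i) & (RING_SIZE - 1)];
	r->out += len;
	return len;
}
```

Because the writer only touches in and the reader only touches out, one reader and one writer do not contend on the same index, which is the property the commit message refers to when it says no additional locking is necessary in that case.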
Linus Torvalds e69381b417 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (45 commits)
  RDMA/cxgb3: Fix error paths in post_send and post_recv
  RDMA/nes: Fix stale ARP issue
  RDMA/nes: FIN during MPA startup causes timeout
  RDMA/nes: Free kmap() resources
  RDMA/nes: Check for zero STag
  RDMA/nes: Fix Xansation test crash on cm_node ref_count
  RDMA/nes: Abnormal listener exit causes loopback node crash
  RDMA/nes: Fix crash in nes_accept()
  RDMA/nes: Resource not freed for REJECTed connections
  RDMA/nes: MPA request/response error checking
  RDMA/nes: Fix query of ORD values
  RDMA/nes: Fix MAX_CM_BUFFER define
  RDMA/nes: Pass correct size to ioremap_nocache()
  RDMA/nes: Update copyright and branding string
  RDMA/nes: Add max_cqe check to nes_create_cq()
  RDMA/nes: Clean up struct nes_qp
  RDMA/nes: Implement IB_SIGNAL_ALL_WR as an iWARP extension
  RDMA/nes: Add additional SFP+ PHY uC status check and PHY reset
  RDMA/nes: Correct fast memory registration implementation
  IB/ehca: Fix error paths in post_send and post_recv
  ...
2009-12-16 10:32:31 -08:00
Roland Dreier 14f369d1d6 Merge branches 'amso1100', 'cma', 'cxgb3', 'ehca', 'ipath', 'ipoib', 'iser', 'misc', 'mlx4' and 'nes' into for-next 2009-12-15 23:39:25 -08:00
Frank Zago 48617f862f RDMA/cxgb3: Fix error paths in post_send and post_recv
Always set bad_wr when an immediate error is detected.  Return ENOMEM
for queue full instead of EINVAL to match other drivers.

Signed-off-by: Frank Zago <fzago@systemfabricworks.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-12-15 23:39:10 -08:00
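The convention that 48617f862f enforces, always reporting the failing request through bad_wr and returning ENOMEM for a full queue, can be sketched as follows. The structures and post_chain() are hypothetical simplifications written for this note and are not the cxgb3 or verbs definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical work-request chain and queue. */
struct work_req {
	struct work_req *next;
	int num_sge;
};

struct work_queue {
	unsigned int depth;
	unsigned int used;
	int max_sge;
};

/*
 * Post a chain of work requests.  On the first failure, *bad_wr points
 * at the request that could not be posted, so the caller knows that
 * everything before it was accepted.
 */
int post_chain(struct work_queue *q, struct work_req *wr,
	       struct work_req **bad_wr)
{
	assert(q != NULL && bad_wr != NULL);

	for (; wr != NULL; wr = wr->next) {
		if (q->used == q->depth) {
			*bad_wr = wr;
			return -ENOMEM;		/* queue full */
		}
		if (wr->num_sge > q->max_sge) {
			*bad_wr = wr;
			return -EINVAL;		/* malformed request */
		}
		q->used++;
	}
	return 0;
}
```

Distinguishing the two error codes lets a caller treat a full queue as a condition to retry later, while a malformed request is a programming error.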
Linus Torvalds d0316554d3 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (34 commits)
  m68k: rename global variable vmalloc_end to m68k_vmalloc_end
  percpu: add missing per_cpu_ptr_to_phys() definition for UP
  percpu: Fix kdump failure if booted with percpu_alloc=page
  percpu: make misc percpu symbols unique
  percpu: make percpu symbols in ia64 unique
  percpu: make percpu symbols in powerpc unique
  percpu: make percpu symbols in x86 unique
  percpu: make percpu symbols in xen unique
  percpu: make percpu symbols in cpufreq unique
  percpu: make percpu symbols in oprofile unique
  percpu: make percpu symbols in tracer unique
  percpu: make percpu symbols under kernel/ and mm/ unique
  percpu: remove some sparse warnings
  percpu: make alloc_percpu() handle array types
  vmalloc: fix use of non-existent percpu variable in put_cpu_var()
  this_cpu: Use this_cpu_xx in trace_functions_graph.c
  this_cpu: Use this_cpu_xx for ftrace
  this_cpu: Use this_cpu_xx in nmi handling
  this_cpu: Use this_cpu operations in RCU
  this_cpu: Use this_cpu ops for VM statistics
  ...

Fix up trivial (famous last words) global per-cpu naming conflicts in
	arch/x86/kvm/svm.c
	mm/slab.c
2009-12-14 09:58:24 -08:00
Linus Torvalds 4ef58d4e2a Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (42 commits)
  tree-wide: fix misspelling of "definition" in comments
  reiserfs: fix misspelling of "journaled"
  doc: Fix a typo in slub.txt.
  inotify: remove superfluous return code check
  hdlc: spelling fix in find_pvc() comment
  doc: fix regulator docs cut-and-pasteism
  mtd: Fix comment in Kconfig
  doc: Fix IRQ chip docs
  tree-wide: fix assorted typos all over the place
  drivers/ata/libata-sff.c: comment spelling fixes
  fix typos/grammos in Documentation/edac.txt
  sysctl: add missing comments
  fs/debugfs/inode.c: fix comment typos
  sgivwfb: Make use of ARRAY_SIZE.
  sky2: fix sky2_link_down copy/paste comment error
  tree-wide: fix typos "couter" -> "counter"
  tree-wide: fix typos "offest" -> "offset"
  fix kerneldoc for set_irq_msi()
  spidev: fix double "of of" in comment
  comment typo fix: sybsystem -> subsystem
  ...
2009-12-09 19:43:33 -08:00
Faisal Latif 7a576dfd9e RDMA/nes: Fix stale ARP issue
When the remote node's ethernet address changes, the connection keeps
trying to connect using the old address.  The connection will continue
failing until the driver is unloaded and loaded again (either reboot or
rmmod).  Fix this by checking that the NIC has the correct address
before starting a connection.

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-12-09 15:54:33 -08:00
Faisal Latif b1190d3e0d RDMA/nes: FIN during MPA startup causes timeout
A FIN that is received during an MPA start up sequence causes a
timeout in iwcm.c.  The connection has not been completely closed so
the iwcm code is waiting for resources to be cleaned up.  This closes
the connection so everything cleans up correctly.

Signed-off-by: Don Wood <donald.e.wood@intel.com>
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-12-09 15:54:32 -08:00
Faisal Latif d2fa9b26e1 RDMA/nes: Free kmap() resources
We fail when creating many qps as kmap() fails for sq_vbase.
Fix this by doing kunmap() as soon as we are done with sq_vbase.
We do kunmap() in one of the locations below:

(1) nes_destroy_qp()
(2) nes_accept()
(3) nes_connect_event

We keep a flag to avoid multiple calls to kunmap().

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-12-09 15:54:28 -08:00
Faisal Latif fd000e12a5 RDMA/nes: Check for zero STag
STags are generated randomly but the driver does not correctly prevent
a zero STag.  Using STag zero is privileged and causes a user space
application to fail.  This change prevents the driver from trying to
allocate a zero STag.

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-12-09 15:54:23 -08:00
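
A minimal sketch of the idea in the commit above, assuming the simplest possible fix: keep drawing random candidates until one is non-zero. The names toy_alloc_stag(), toy_rng_fn and toy_seq_rng() are hypothetical, and the real driver's STag layout and random source are not shown in this log.

```c
#include <stdint.h>

/* Hypothetical random source; a real driver would use the kernel's RNG. */
typedef uint32_t (*toy_rng_fn)(void *state);

/*
 * Draw candidate STags until a non-zero one appears.  STag zero is
 * privileged, so it must never be handed to a user-space consumer.
 */
static uint32_t toy_alloc_stag(toy_rng_fn rng, void *state)
{
	uint32_t stag;

	do {
		stag = rng(state);
	} while (stag == 0);

	return stag;
}

/* Deterministic stand-in RNG for demonstration: yields 0, 0, 7, 8, ... */
static uint32_t toy_seq_rng(void *state)
{
	static const uint32_t seq[] = { 0, 0, 7, 8 };
	unsigned int *idx = state;
	uint32_t v = seq[*idx % 4];

	(*idx)++;
	return v;
}
```

Passing the generator as a function pointer is only a convenience for the sketch: it lets the zero case be exercised on demand instead of waiting for a real RNG to produce it.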
Faisal Latif 886f98a315 RDMA/nes: Fix Xansation test crash on cm_node ref_count
While running a Xansation test, an active side node crashed.  The
problem started on the passive side, which generated an STag that was
0.  The passive side sent a TERMINATE instead of an MPA REJECT msg.
The active side receives TERMINATE, sends connect_err() and sets
the cm_node state to CLOSED.  The passive side sends FIN + ACK after
TERMINATE.  Active side ends up in handle_ack_pkt() and send_reset().
send_reset() consumes 1 cm_node's ref_count.  Because the cm_node is
in CLOSED state, which means that cm_node will be destroyed after
completion of the connect_err() indication, CM will crash after
send_reset().

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-12-09 15:54:18 -08:00
Faisal Latif f9f3f1e08b RDMA/nes: Abnormal listener exit causes loopback node crash
When the listener is destroyed for a loopback connection, the listener
node gets a reset event.  This causes a crash as the listener is not
expecting a reset event.  Code review of cm_event_reset() during
debugging showed the cm_id ref count is incremented after calling its
event handler and not before.

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-12-09 15:54:14 -08:00
Faisal Latif c5a7d48971 RDMA/nes: Fix crash in nes_accept()
While running IMP_EXT's window test, we saw a crash in nes_accept().
Here is the sequence of what happened:

(1) In MVAPICH2, connect request is received for port #0.

FIX:  Add a nes_connect() check to make sure local or remote tcp port
      is not 0.

(2) Remote node's (passive) TCP stack sends a reset when it gets a
    connect request because of port = 0.  Active side set the connect
    error to IW_CM_EVENT_STATUS_REJECTED when it received the RST from
    remote node.

FIX: The correct error code is -ECONNRESET.

(3) Wrong error code of IW_CM_EVENT_STATUS_REJECTED causes the core to
    destroy its listener ports.  Here there are connections that may
    have sent an MPA request up and waiting for accept or reject.  But
    the listener and its cm_nodes have been freed already causing the
    crash noticed.

FIX: The cm_node is freed only if its state is not
     NES_CM_STATE_MPAREQ_RCVD.  If cm_node's state is
     NES_CM_STATE_MPAREQ_RCVD then its new state is set to
     NES_CM_STATE_LISTENER_DESTROYED and it is not freed.  When
     nes_accept() or nes_reject() is received, its state is checked
     for NES_CM_STATE_LISTENER_DESTROYED and in this case the cm_node
     is freed and error is returned.

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-12-09 15:54:08 -08:00
Faisal Latif 69524e1aff RDMA/nes: Resource not freed for REJECTed connections
During testing of REJECT connection error handling, we saw that the
cm_id resources are not released.  When the retransmit timer expires,
we need to send a reset message to remote node before issuing the
ABORTED event.

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-12-09 15:54:03 -08:00
Faisal Latif 1cf078c995 RDMA/nes: MPA request/response error checking
During Xansation testing, we saw that errors in MPA request/response
frames are not handled properly.

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-12-09 15:53:54 -08:00
Faisal Latif 8ac7f6e1af RDMA/nes: Fix query of ORD values
The ORD size needs updating as we are supporting more inbound READ
resources per connection.

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-12-09 15:53:46 -08:00
Faisal Latif 9b84dbe7f4 RDMA/nes: Fix MAX_CM_BUFFER define
Change MAX_CM_BUFFER for MPA frames to be conformant to RFC 5044:
we need 512 + 20 instead of 512.

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-12-09 15:53:36 -08:00
Julia Lawall d85ddd835b RDMA/nes: Pass correct size to ioremap_nocache()
The size argument to ioremap_nocache should be the size of desired
information, not the pointer to it.

The semantic match that finds this problem is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@expression@
expression *x;
@@

x =
 <+...
*sizeof(x)
...+>// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Acked-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-12-09 15:21:57 -08:00
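
The class of bug that the semantic match above looks for can be shown in a few lines of plain C. This is an illustrative sketch only: struct toy_regs and both helper functions are hypothetical and do not come from the nes driver.

```c
#include <stddef.h>

/* Hypothetical register block, standing in for a device's MMIO layout. */
struct toy_regs {
	unsigned char bytes[256];
};

/* Wrong: measures the pointer itself (typically 4 or 8 bytes). */
static size_t toy_map_size_buggy(struct toy_regs *regs)
{
	return sizeof(regs);
}

/* Right: measures the object the pointer refers to. */
static size_t toy_map_size_fixed(struct toy_regs *regs)
{
	return sizeof(*regs);
}
```

Because sizeof does not evaluate its operand here, sizeof(*regs) is safe even before the pointer has been assigned, which is exactly the situation when computing the length for a mapping call.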
Chien Tung fa6c87d510 RDMA/nes: Update copyright and branding string
Update copyright from Intel-NE, Inc. to Intel Corporation.  Use proper
branding string in Kconfig and simplify description.

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-12-09 15:21:56 -08:00
Chien Tung 5924aea6e2 RDMA/nes: Add max_cqe check to nes_create_cq()
Add a check to nes_create_cq() to return -EINVAL if creating a CQ with
depth > max_cqe (32766).

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-12-09 15:21:56 -08:00
Chien Tung 75742c630e RDMA/nes: Clean up struct nes_qp
Remove unused and not really used variables.

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-12-09 15:21:56 -08:00
Chien Tung d14152da13 RDMA/nes: Implement IB_SIGNAL_ALL_WR as an iWARP extension
Add IB_SIGNAL_ALL_WR support as an iWARP extension.  If set, make sure
all WR for the QP are signalled.  Consolidate flags used in nesqp
structure.

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-12-09 15:21:56 -08:00
Chien Tung a276510328 RDMA/nes: Add additional SFP+ PHY uC status check and PHY reset
Add additional PHY uC status check in case PHY firmware is not running
properly with heartbeat.  Add a hard PHY reset if uC status is 0x0
after initial reset.

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-12-09 15:21:56 -08:00
Chien Tung e293a26fe9 RDMA/nes: Correct fast memory registration implementation
Replace alloc_fmr, unmap_fmr, dealloc_fmr and map_phys_fmr with
alloc_fast_reg_mr, alloc_fast_reg_page_list, free_fast_reg_page_list.

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-12-09 15:21:54 -08:00
Frank Zago e147de0361 IB/ehca: Fix error paths in post_send and post_recv
Always set bad_wr when an immediate error is detected.  Do not report
success if an error occurred.

Signed-off-by: Frank Zago <fzago@systemfabricworks.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-12-09 15:07:25 -08:00
Frank Zago c597b0240b RDMA/amso1100: Fix error paths in post_send and post_recv
Always set bad_wr when an immediate error is detected.

Signed-off-by: Frank Zago <fzago@systemfabricworks.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-12-09 14:56:11 -08:00
Chien Tung 649fe4aeab RDMA/nes: Add support for IB_WR_*INV
Add support for IB_WR_SEND_WITH_INV, IB_WR_RDMA_READ_WITH_INV
and IB_WR_LOCAL_INV.

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-12-09 13:51:37 -08:00
Frank Zago 4293fdc115 RDMA/nes: In nes_post_recv() always set bad_wr on error
On error, set bad_wr in nes_post_recv().  Stop processing ib_wr queue
when an error is detected.

Signed-off-by: Frank Zago <fzago@systemfabricworks.com>
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-12-09 13:51:36 -08:00
Frank Zago e5dec39474 RDMA/nes: In nes_post_send() always set bad_wr on error
On error, set bad_wr in nes_post_send().  Stop processing ib_wr queue
when an error is detected.

Signed-off-by: Frank Zago <fzago@systemfabricworks.com>
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-12-09 13:51:36 -08:00
Alexander Schmidt 9420269428 IB/ehca: Rework destroy_eq()
The ibmebus_free_irq() function, which might sleep, was called with
interrupts disabled.  To fix this, make sure that no interrupts are
running by killing the interrupt tasklet.  Also lock the
shca_list_lock to protect against the poll_eqs_timer running
concurrently.

Signed-off-by: Alexander Schmidt <alexs@linux.vnet.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-12-09 10:11:04 -08:00
Akinobu Mita 598cb6f327 IB/ipath: Use bitmap_weight()
Use bitmap_weight() instead of finding all set bits in bitmap by hand.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Ralph Campbell <infinipath@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-12-09 10:05:28 -08:00
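
To illustrate the kind of cleanup the commit above describes, here is a user-space sketch contrasting a bit-by-bit loop with a word-at-a-time helper in the spirit of bitmap_weight(). The toy_* names are hypothetical; the kernel's own helper is not reproduced here.

```c
#include <limits.h>
#include <stddef.h>

#define TOY_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Open-coded version: test every bit position in turn. */
static unsigned int toy_count_by_hand(const unsigned long *map, size_t nbits)
{
	unsigned int count = 0;
	size_t i;

	for (i = 0; i < nbits; i++)
		if (map[i / TOY_BITS_PER_LONG] &
		    (1UL << (i % TOY_BITS_PER_LONG)))
			count++;
	return count;
}

/* Count set bits in one word by clearing the lowest set bit each pass. */
static unsigned int toy_popcount_long(unsigned long w)
{
	unsigned int n = 0;

	while (w) {
		w &= w - 1;
		n++;
	}
	return n;
}

/* Helper version: whole words first, then the masked partial word. */
static unsigned int toy_bitmap_weight(const unsigned long *map, size_t nbits)
{
	size_t full = nbits / TOY_BITS_PER_LONG;
	size_t rem = nbits % TOY_BITS_PER_LONG;
	unsigned int count = 0;
	size_t i;

	for (i = 0; i < full; i++)
		count += toy_popcount_long(map[i]);
	if (rem)
		count += toy_popcount_long(map[full] & ((1UL << rem) - 1));
	return count;
}
```

Both functions return the same count; the point of using a shared helper is that callers no longer carry their own loop and masking logic.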
Linus Torvalds 18821b0408 Merge branch 'bkl-drivers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'bkl-drivers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  agp: Remove the BKL from agp_open
  inifiband: Remove BKL from ipath_open()
  mips: Remove BKL from tb0219
  drivers: Remove BKL from scx200_gpio
  drivers: Remove BKL from pc8736x_gpio
  parisc: Remove BKL from eisa_eeprom
  rtc: Remove BKL from efirtc
  input: Remove BKL from hp_sdc_rtc
  hw_random: Remove BKL from core
  macintosh: Remove BKL from ans-lcd
  nvram: Drop the bkl from non-generic nvram_llseek()
  nvram: Drop the bkl from nvram_llseek()
  mem_class: Drop the bkl from memory_open()
  spi: Remove BKL from spidev_open
  drivers: Remove BKL from cs5535_gpio
  drivers: Remove BKL from misc_open
2009-12-09 08:07:38 -08:00
André Goddard Rosa af901ca181 tree-wide: fix assorted typos all over the place
That is "success", "unknown", "through", "performance", "[re|un]mapping"
, "access", "default", "reasonable", "[con]currently", "temperature"
, "channel", "[un]used", "application", "example","hierarchy", "therefore"
, "[over|under]flow", "contiguous", "threshold", "enough" and others.

Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2009-12-04 15:39:55 +01:00
David S. Miller 3505d1a9fd Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/sfc/sfe4001.c
	drivers/net/wireless/libertas/cmd.c
	drivers/staging/Kconfig
	drivers/staging/Makefile
	drivers/staging/rtl8187se/Kconfig
	drivers/staging/rtl8192e/Kconfig
2009-11-18 22:19:03 -08:00
Eli Cohen 417608c20a IB/mlx4: Remove limitation on LSO header size
Current code has a limitation: an LSO header is not allowed to cross a
64 byte boundary.  This patch removes this limitation by setting the
WQE RR for large headers thus allowing LSO headers of any size.  The
extra buffer reserved for MLX4_IB_QP_LSO QPs has been doubled, from 64
to 128 bytes, assuming this is reasonable upper limit for header
length.  Also, this patch will cause IB_DEVICE_UD_TSO to be set only
for HCA FW versions that set MLX4_DEV_CAP_FLAG_BLH; e.g. FW version
2.6.000 and higher.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-11-12 11:19:44 -08:00
Eli Cohen ecdc428e4c IB/mlx4: Remove unneeded code
There is no such flag DE - the field is reserved and should be zero.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-11-12 11:14:13 -08:00
Uwe Kleine-König 21ae2956ce tree-wide: fix typos "aquire" -> "acquire", "cumsumed" -> "consumed"
This patch was generated by

	git grep -E -i -l '[Aa]quire' | xargs -r perl -p -i -e 's/([Aa])quire/$1cquire/'

and the cumsumed was found by checking the diff for aquire.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2009-11-09 09:40:57 +01:00
Thomas Gleixner f96d3015e9 inifiband: Remove BKL from ipath_open()
cycle_kernel_lock() got pushed down to ipath_open(). I tried hard to
understand what it might protect, but finally gave up.

Roland noted that qlogic seems to have abandoned the ipath driver and
came to the following wise conclusion: "So I guess if the BKL stuff is
blocking you in any way, we can just drop it from ipath and leave it
as yet another race condition in a rotting old driver."

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <adad44tj090.fsf@cisco.com>
Cc: Roland Dreier <rdreier@cisco.com>
2009-10-14 17:36:54 +02:00
Alexey Dobriyan d43c36dc6b headers: remove sched.h from interrupt.h
Now that m68k's task_thread_info() no longer refers to current,
it's possible to remove sched.h from interrupt.h and not break m68k!
Many thanks to Heiko Carstens for allowing this.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2009-10-11 11:20:58 -07:00
Steve Wise e5da4ed8a4 RDMA/cxgb3: Handle NULL inetdev pointer in iwch_query_port()
in_dev_get() can return NULL.  If it does, iwch_query_port() will crash.
Handle the NULL case by mapping it to port state INIT.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-10-07 15:51:07 -07:00
Ben Hutchings 15f0a394c6 net: Convert ethtool {get_stats, self_test}_count() ops to get_sset_count()
These string query operations were supposed to be replaced by the
generic get_sset_count() starting in 2007.  Convert the remaining
implementations.

Also remove calls to these operations to initialise drvinfo->n_stats.
The ethtool core code already does that.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-05 00:10:10 -07:00
Christoph Lameter ca0c9584b1 this_cpu: Straight transformations
Use this_cpu_ptr and __this_cpu_ptr in locations where straight
transformations are possible because per_cpu_ptr is used with
either smp_processor_id() or raw_smp_processor_id().

cc: David Howells <dhowells@redhat.com>
Acked-by: Tejun Heo <tj@kernel.org>
cc: Ingo Molnar <mingo@elte.hu>
cc: Rusty Russell <rusty@rustcorp.com.au>
cc: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-10-03 19:48:22 +09:00
Alexey Dobriyan f0f37e2f77 const: mark struct vm_struct_operations
* mark struct vm_area_struct::vm_ops as const
* mark vm_ops in AGP code

But leave TTM code alone, something is fishy there with global vm_ops
being used.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-27 11:39:25 -07:00
Linus Torvalds d7757be133 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  IPoIB: Don't turn on carrier for a non-active port
  IB/mthca: Fix access to freed memory in catastrophic event handling
  mlx4_core: Pass cache line size to device FW
  RDMA/nes: Remove duplicate .ndo_set_mac_address field initialization
  IB/mad: Fix lock-lock-timer deadlock in RMPP code
2009-09-24 17:06:01 -07:00
Roland Dreier 216c7f92b9 Merge branches 'ipoib', 'mad', 'mlx4', 'mthca' and 'nes' into for-linus 2009-09-24 12:43:08 -07:00
Jack Morgenstein d686159e50 IB/mthca: Fix access to freed memory in catastrophic event handling
catas_reset() uses a pointer to mthca_dev, but mthca_dev is not valid
after the call to __mthca_restart_one().

Based on a similar patch for mlx4 (634354d7, "mlx4: Fix access to
freed memory") by Vitaliy Gusev <vgusev@openvz.org>

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-09-24 11:55:41 -07:00
Julia Lawall bdf643816a RDMA/nes: Remove duplicate .ndo_set_mac_address field initialization
The definition of nes_netdev_ops has initializations of a local function
and eth_mac_addr for its ndo_set_mac_address field.  This change uses only
the local function.

The semantic match that finds this problem is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@r@
identifier I, s, fld;
position p0,p;
expression E;
@@

struct I s =@p0 { ... .fld@p = E, ...};

@s@
identifier I, s, r.fld;
position r.p0,p;
expression E;
@@

struct I s =@p0 { ... .fld@p = E, ...};

@script:python@
p0 << r.p0;
fld << r.fld;
ps << s.p;
pr << r.p;
@@

if int(ps[0].line)!=int(pr[0].line) or int(ps[0].column)!=int(pr[0].column):
  cocci.print_main(fld,p0)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-09-24 10:59:34 -07:00
KAMEZAWA Hiroyuki 908eedc616 walk system ram range
Originally, walk_memory_resource() was introduced to traverse all memory
of "System RAM" for detecting memory hotplug/unplug range.  For doing so,
flags of IORESOURCE_MEM|IORESOURCE_BUSY were used and this was enough for
memory hotplug.

But when used for other purposes, such as /proc/kcore, this may include
some firmware areas marked as IORESOURCE_BUSY | IORESOURCE_MEM.  This
patch makes the check strict to find out busy "System RAM".

Note: PPC64 keeps its own walk_memory_resource(), which walks through
ppc64's lmb information.  Because old kclist_add() is called per lmb, this
patch makes no difference in behavior, finally.

And this patch removes CONFIG_MEMORY_HOTPLUG check from this function.
Because pfn_valid() just shows "there is memmap or not" and cannot be used
for "there is physical memory or not", this function is generally useful
for scanning physical memory ranges.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: WANG Cong <xiyou.wangcong@gmail.com>
Cc: Américo Wang <xiyou.wangcong@gmail.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-23 07:39:41 -07:00
Anand Gadiyar 411c940385 trivial: fix typo "for for" in multiple files
trivial: fix typo "for for" in multiple files

Signed-off-by: Anand Gadiyar <gadiyar@ti.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2009-09-21 15:14:54 +02:00
David Brownell a4dbd6740d driver model: constify attribute groups
Let attribute group vectors be declared "const".  We'd
like to let most attribute metadata live in read-only
sections... this is a start.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-09-15 09:50:47 -07:00
Linus Torvalds d7e9660ad9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1623 commits)
  netxen: update copyright
  netxen: fix tx timeout recovery
  netxen: fix file firmware leak
  netxen: improve pci memory access
  netxen: change firmware write size
  tg3: Fix return ring size breakage
  netxen: build fix for INET=n
  cdc-phonet: autoconfigure Phonet address
  Phonet: back-end for autoconfigured addresses
  Phonet: fix netlink address dump error handling
  ipv6: Add IFA_F_DADFAILED flag
  net: Add DEVTYPE support for Ethernet based devices
  mv643xx_eth.c: remove unused txq_set_wrr()
  ucc_geth: Fix hangs after switching from full to half duplex
  ucc_geth: Rearrange some code to avoid forward declarations
  phy/marvell: Make non-aneg speed/duplex forcing work for 88E1111 PHYs
  drivers/net/phy: introduce missing kfree
  drivers/net/wan: introduce missing kfree
  net: force bridge module(s) to be GPL
  Subject: [PATCH] appletalk: Fix skb leak when ipddp interface is not loaded
  ...

Fixed up trivial conflicts:

 - arch/x86/include/asm/socket.h

   converted to <asm-generic/socket.h> in the x86 tree.  The generic
   header has the same new #define's, so that works out fine.

 - drivers/net/tun.c

   fix conflict between 89f56d1e9 ("tun: reuse struct sock fields") that
   switched over to using 'tun->socket.sk' instead of the redundantly
   available (and thus removed) 'tun->sk', and 2b980dbd ("lsm: Add hooks
   to the TUN driver") which added a new 'tun->sk' use.

   Noted in 'next' by Stephen Rothwell.
2009-09-14 10:37:28 -07:00
Roland Dreier 45c448a1c0 Merge branches 'cxgb3', 'ehca', 'ipath', 'ipoib', 'misc', 'mlx4', 'mthca' and 'nes' into for-linus 2009-09-10 21:18:07 -07:00
Steve Wise ffc40c6433 RDMA/cxgb3: Clean up properly on FW mismatch failures
FW mismatches can cause a crash in the iw_cxgb3 event handler.

- NULL the t3cdev->ulp pointer on failures in cxio_rdev_open()
- Silently ignore events when the ulp ptr is NULL in iwch_err_handler()

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-09-09 11:25:56 -07:00
Steve Wise 13a239330a RDMA/cxgb3: Don't ignore insert_handle() failures
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-09-09 11:25:55 -07:00
Chien Tung cd1d3f7abe RDMA/nes: Map MTU to IB_MTU_* and correctly report link state
Old query_port code reports static MTU and link state values.
Instead, map actual MTU to next largest IB_MTU_* constant and
correctly report link state.

Cc: Steve Wise <swise@opengridcomputing.com>
Reported-by: Jeff Squyres <jsquyres@cisco.com>
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-09-05 20:36:39 -07:00
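
One possible reading of the MTU mapping in the commit above, sketched in user-space C. The assumption here (not confirmed by this log) is that the actual MTU is mapped to the largest standard size that does not exceed it, with 256 as the floor. enum toy_ib_mtu and toy_mtu_to_enum() are hypothetical stand-ins for the IB_MTU_* constants and the driver's own logic.

```c
/* Hypothetical stand-ins for the IB_MTU_* constants. */
enum toy_ib_mtu {
	TOY_IB_MTU_256,
	TOY_IB_MTU_512,
	TOY_IB_MTU_1024,
	TOY_IB_MTU_2048,
	TOY_IB_MTU_4096
};

/*
 * Map a byte MTU to the largest standard size not exceeding it
 * (assumed interpretation); anything below 512 reports 256.
 */
static enum toy_ib_mtu toy_mtu_to_enum(unsigned int mtu)
{
	if (mtu >= 4096)
		return TOY_IB_MTU_4096;
	if (mtu >= 2048)
		return TOY_IB_MTU_2048;
	if (mtu >= 1024)
		return TOY_IB_MTU_1024;
	if (mtu >= 512)
		return TOY_IB_MTU_512;
	return TOY_IB_MTU_256;
}
```

Under this reading a standard 1500-byte Ethernet MTU reports the 1024 constant and a 9000-byte jumbo MTU reports the 4096 constant, rather than a single static value.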
Don Wood b29a4fc49b RDMA/nes: Rework the disconn routine for terminate and flushing
The disconn routine has been reworked to accommodate the terminate and
flushing changes.  The routine has been reorganized to make all the
decisions at the start then it performs all the required operations.
This simplified the lock handling and is easier to follow.

Signed-off-by: Don Wood <donald.e.wood@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-09-05 20:36:39 -07:00
Don Wood 320cdfd21d RDMA/nes: Use the flush code to fill in cqe error
Use the flush status to fill in cqe status when a specific error has
been identified.  Subsequent flushed completions still use the flushed
value.

Signed-off-by: Don Wood <donald.e.wood@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-09-05 20:36:39 -07:00
Don Wood 6eed5e7c8b RDMA/nes: Make poll_cq return correct number of wqes during flush
When a flush request is given to the hw, it will place one cqe marked
as flushed (unless there is nothing to flush).  An application that is
waiting for all wqe's to complete will be left hanging.  This modifies
poll_cq to return the correct number of flushes for the pending
elements on the wq.

Signed-off-by: Don Wood <donald.e.wood@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-09-05 20:36:39 -07:00
Don Wood 4b281faec3 RDMA/nes: Use flush mechanism to set status for wqe in error
When an asynchronous event occurs that requires a terminate, it is
sometimes possible to identify the wqe in error.  This change uses
flush to get this information to the poll routine.  The flush
operation puts the status into the cqe.  If this information is not
available, it continues to use the more generic flush code as before.

Signed-off-by: Don Wood <donald.e.wood@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-09-05 20:36:38 -07:00
Don Wood 8b1c9dc4ba RDMA/nes: Implement Terminate Packet
Implement the sending and receiving of Terminate packets.

Signed-off-by: Don Wood <donald.e.wood@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-09-05 20:36:38 -07:00
Don Wood 3c28b4457a RDMA/nes: Add CQ error handling
CQ errors are not being handled correctly.  Put in the upcall for
CQ errors.

Signed-off-by: Don Wood <donald.e.wood@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-09-05 20:36:38 -07:00
Don Wood 5ee21fe0ea RDMA/nes: Clean out CQ completions when QP is destroyed
When a QP is destroyed, unprocessed CQ entries could still reference
the QP.  This change zeroes the context value at QP destroy time.  By
skipping over cqe's with a zero context, poll_cq no longer processes a
cqe for a destroyed QP.

Signed-off-by: Don Wood <donald.e.wood@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-09-05 20:36:37 -07:00
Don Wood ba0c5d9a89 RDMA/nes: Change memory allocation for cqp request to GFP_ATOMIC
The routine to allocate a cqp request is not called from process
context code.  Since it is not OK to sleep, it needs to use GFP_ATOMIC
not GFP_KERNEL.

Signed-off-by: Don Wood <donald.e.wood@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-09-05 20:36:37 -07:00
Don Wood 873fcdd4bf RDMA/nes: Allocate work item for disconnect event handling
The code currently has a work structure in the QP.  This requires a
lock and a pending flag to ensure there is never more than one request
active.  When two events happen quickly (such as FIN and LLP CLOSE),
it causes unnecessary timeouts since the second one is dropped.

This fix allocates memory for the work request so the second one can
be queued.  A lock is removed since it is no longer needed.

Signed-off-by: Don Wood <donald.e.wood@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-09-05 20:36:37 -07:00
Don Wood c4c3f279cd RDMA/nes: Update refcnt during disconnect
During termination, it is possible for the refcnt to go to zero while
the worker thread is posting events upward.  This fix increments the
refcnt before the request is passed to the worker thread.  The thread
decrements the refcnt when the request is completed.

Signed-off-by: Don Wood <donald.e.wood@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-09-05 20:36:36 -07:00
Jack Morgenstein d841064777 IB/mthca: Don't allow userspace open while recovering from catastrophic error
Userspace apps are supposed to release all ib device resources if they
receive a fatal async event (IBV_EVENT_DEVICE_FATAL).  However, the
app has no way of knowing when the device has come back up, except to
repeatedly attempt ibv_open_device() until it succeeds.

However, currently there is no protection against the open succeeding
while the device is being removed following the fatal event.  In
this case, the open will succeed, but as a result the device waits in
the middle of its removal until the new app releases its resources --
and the new app will not do so, since the open succeeded at a point
following the fatal event generation.

This patch adds an "active" flag to the device. The active flag is set
to false (in the fatal event flow) before the "fatal" event is
generated, so any subsequent ibv_open_device() call to the device will
fail until the device comes back up, thus preventing the above
deadlock.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-09-05 20:36:16 -07:00
Arputham Benjamin d94a868901 IB/mthca: Distinguish multiple devices in /proc/interrupts
The mthca driver uses the same name for interrupts for every
device in the system.  This can make it very confusing trying to work
out exactly which device MSI-X interrupts are for.  Change the driver
to add the PCI name of the device to the interrupt name.

Signed-off-by: Arputham Benjamin <abenjamin@sgi.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-09-05 20:36:15 -07:00
Roland Dreier ffe063f32b IB/mthca: Annotate CQ locking
mthca_ib_lock_cqs()/mthca_ib_unlock_cqs() are helper functions that
lock/unlock both CQs attached to a QP in the proper order to avoid
AB-BA deadlocks.  Annotate this so sparse can understand what's going
on (and warn us if we misuse these functions).

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-09-05 20:36:15 -07:00
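
The "proper order" idea behind the helpers in the commit above can be sketched as follows. This is a toy model under stated assumptions: the ordering key is taken to be a per-CQ number, real spinlocks are replaced by a flag plus an acquisition trace, and every toy_* name is hypothetical.

```c
#include <stddef.h>

struct toy_cq {
	unsigned int cqn; /* hypothetical CQ number used as the ordering key */
	int locked;       /* stand-in for a real spinlock */
};

/* Record of acquisition order, kept only so the sketch can be checked. */
struct toy_trace {
	const struct toy_cq *order[2];
	size_t n;
};

static void toy_lock(struct toy_cq *cq, struct toy_trace *t)
{
	cq->locked = 1;
	t->order[t->n++] = cq;
}

/*
 * Lock both CQs of a QP.  Always take the lower-numbered CQ first, so two
 * callers that name the same pair in opposite roles cannot each hold one
 * lock while waiting for the other (the AB-BA case).
 */
static void toy_lock_cqs(struct toy_cq *send_cq, struct toy_cq *recv_cq,
			 struct toy_trace *t)
{
	if (send_cq == recv_cq) {
		toy_lock(send_cq, t);	/* same CQ: lock it only once */
	} else if (send_cq->cqn < recv_cq->cqn) {
		toy_lock(send_cq, t);
		toy_lock(recv_cq, t);
	} else {
		toy_lock(recv_cq, t);
		toy_lock(send_cq, t);
	}
}
```

Whichever CQ is passed as the send side, the acquisition order comes out the same, which is the property a static checker annotation can then help preserve.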
Roland Dreier deecb5d672 IB/mthca: Remove unnecessary include of <linux/init.h>
mthca_reset.c doesn't have any function annotations, so there's no
reason to include <linux/init.h>.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-09-05 20:36:15 -07:00
Roland Dreier fc1285585f IB/mthca: Remove unnecessary include of <asm/page.h>
mthca_config_reg.h was including <asm/page.h> for no reason -- the whole
file is just defines of constants, so it's entirely self-contained.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-09-05 20:36:13 -07:00
Jack Morgenstein 3b4a8cd51e IB/mlx4: Don't allow userspace open while recovering from catastrophic error
Userspace apps are supposed to release all ib device resources if they
receive a fatal async event (IBV_EVENT_DEVICE_FATAL).  However, the
app has no way of knowing when the device has come back up, except to
repeatedly attempt ibv_open_device() until it succeeds.

However, currently there is no protection against the open succeeding
while the device is being removed following the fatal event.  In
this case, the open will succeed, but as a result the device waits in
the middle of its removal until the new app releases its resources --
and the new app will not do so, since the open succeeded at a point
following the fatal event generation.

This patch adds an "active" flag to the device. The active flag is set
to false (in the fatal event flow) before the "fatal" event is
generated, so any subsequent ibv_open_device() call to the device will
fail until the device comes back up, thus preventing the above
deadlock.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-09-05 20:24:50 -07:00
Roland Dreier 338a8fad27 IB/mlx4: Annotate CQ locking
mlx4_ib_lock_cqs()/mlx4_ib_unlock_cqs() are helper functions that
lock/unlock both CQs attached to a QP in the proper order to avoid
AB-BA deadlocks.  Annotate this so sparse can understand what's going
on (and warn us if we misuse these functions).

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-09-05 20:24:49 -07:00
Roel Kluin 1493ab4083 RDMA/amso1100: Check kmalloc() result in c2_register_device()
dev->ibdev.iwcm allocation may fail, prevent a dereference.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-09-05 20:24:24 -07:00
Marcin Slusarz f1aa78b26e IB: Use printk_once() for driver versions
Replace open-coded reimplementations with printk_once().
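
For illustration, a userspace stand-in for the once-only print pattern.
print_once() and driver_init_one() are names made up for this sketch, and
the counter exists only to make the effect observable.  Unlike a production
helper, this version makes no attempt at thread safety.

```c
#include <stdio.h>
#include <stdbool.h>

static int messages_printed;  /* counts real prints, for inspection only */

/* The static flag lives at the call site, so each use of the macro prints
 * at most once no matter how often the enclosing code runs. */
#define print_once(...)                 \
    do {                                \
        static bool printed;            \
        if (!printed) {                 \
            printed = true;             \
            messages_printed++;         \
            printf(__VA_ARGS__);        \
        }                               \
    } while (0)

/* Hypothetical per-device init path that announces the driver version. */
static void driver_init_one(void)
{
    print_once("example driver v1.0\n");
}
```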

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-09-05 20:24:24 -07:00
Tobias Klauser 181c74e87e RDMA/amso1100: Use %pM conversion specifier
Use the %pM conversion specifier to print a MAC address.
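
%pM is a kernel printk extension and is not available in userspace
printf().  As a sketch of the open-coded formatting it replaces,
format_mac() below (a name chosen for the example) writes six bytes as
colon-separated lowercase hex.

```c
#include <stdio.h>
#include <stddef.h>
#include <stdint.h>

/* Open-coded equivalent of printing a MAC address: six two-digit lowercase
 * hex bytes separated by colons.  Returns the snprintf() result. */
static int format_mac(char *buf, size_t len, const uint8_t mac[6])
{
    return snprintf(buf, len, "%02x:%02x:%02x:%02x:%02x:%02x",
                    mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
}
```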

Signed-off-by: Tobias Klauser <klto@zhaw.ch>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-09-05 20:24:23 -07:00
Roel Kluin 286b63d096 IB/ipath: strncpy() doesn't always NUL-terminate
strlcpy() will always null terminate the string.  node_desc is not
guaranteed to be NUL-terminated so just use memcpy().
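
A small userspace sketch of the underlying issue.  The helper names are
invented for the example: strncpy() leaves the destination unterminated
when the source fills it, while a fixed-size field that is not required to
be a C string can simply be copied by length.

```c
#include <string.h>
#include <stdbool.h>
#include <stddef.h>

/* True if buf[0..len-1] contains a NUL byte. */
static bool has_nul(const char *buf, size_t len)
{
    return memchr(buf, '\0', len) != NULL;
}

/* Plain strncpy(): no terminator is written if src has len or more chars. */
static void strncpy_field(char *dst, size_t len, const char *src)
{
    strncpy(dst, src, len);
}

/* Copy into a fixed-size field that need not be NUL-terminated: copy at
 * most dst_len bytes and zero-pad whatever is left. */
static void copy_fixed_field(char *dst, size_t dst_len, const char *src)
{
    size_t n = strlen(src);

    if (n > dst_len)
        n = dst_len;
    memcpy(dst, src, n);
    memset(dst + n, 0, dst_len - n);
}
```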

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-09-05 20:23:21 -07:00
Joachim Fenkes 6303e74c69 IB/ehca: Fix CQE flags reporting
The driver was reporting CQE flags in the wrong bit positions, causing
consumers to miss incoming immediate data.

Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-09-05 20:22:55 -07:00
Joachim Fenkes d706834d99 IB/ehca: Construct MAD redirect replies from request MAD
The old code used a lot of hard-coded values, which might not be valid
in all environments (especially routed fabrics or partitioned
subnets).  Copy as much information as possible from the incoming
request to correct that.

Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-09-05 20:22:55 -07:00
Alexander Schmidt 50d40b8e53 IB/ehca: Make port autodetect mode the default
Make port autodetect mode the default for the ehca driver. The
autodetect code has been in the kernel for several releases now and
has proved to be stable.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-09-05 20:22:54 -07:00
Steve Wise a52bf98d99 RDMA/cxgb3: Wake up any waiters on peer close/abort
A close/abort while waiting for a wr_ack during connection migration
can cause a hung process in iwch_accept_cr/iwch_reject_cr.

The fix is to set rpl_error/rpl_done and wake up the waiters when we
get a close/abort while in MPA_REQ_RCVD state.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-09-05 20:22:38 -07:00
Steve Wise 6e47fe4350 RDMA/cxgb3: Don't free endpoints early
- Keep ref on connection request endpoints until either accepted or
  rejected so it doesn't get freed early.

- Endpoint flags now need to be set via atomic bitops because they can
  be set on both the iw_cxgb3 workqueue thread and user disconnect
  threads.

- Don't move out of CLOSING too early due to multiple calls to
  iwch_ep_disconnect.
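
The second point above can be sketched in userspace with C11 atomics
standing in for the kernel's atomic bitops.  struct toy_ep and the bit
numbers are hypothetical; only the pattern (atomic read-modify-write on a
shared flags word) is the point.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag bit numbers for the example. */
enum { EP_CLOSE_SENT = 0, EP_ABORT_SENT = 1 };

struct toy_ep {
    atomic_ulong flags;  /* shared between a worker and user threads */
};

/* Atomically set bit nr and report whether it was already set, in the
 * spirit of the kernel's test_and_set_bit(). */
static bool ep_test_and_set(struct toy_ep *ep, unsigned int nr)
{
    unsigned long mask = 1UL << nr;

    return (atomic_fetch_or(&ep->flags, mask) & mask) != 0;
}

/* Atomically read bit nr. */
static bool ep_test(struct toy_ep *ep, unsigned int nr)
{
    return ((atomic_load(&ep->flags) >> nr) & 1UL) != 0;
}
```

A plain "flags |= mask" from two threads can lose one of the updates; the
atomic fetch-or cannot.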

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-09-05 20:22:38 -07:00
Steve Wise fa0d4c11c4 RDMA/cxgb3: Handle port events properly
Massage the err_handler upcall into an event handler upcall, pass
netdev port events to the cxgb3 ULPs and generate RDMA port events
based on LLD port events.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-09-05 20:22:38 -07:00
Steve Wise b496fe82d4 RDMA/cxgb3: Set the appropriate IO channel in rdma_init work requests
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-09-05 20:22:37 -07:00
Steve Wise 3793d2fc3e RDMA/cxgb3: iwch_unregister_device leaks memory
The iwcm struct mem is never freed.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-09-05 20:22:36 -07:00
Eric Dumazet 451f144398 drivers: Kill now superfluous ->last_rx stores
The generic packet receive code takes care of setting
netdev->last_rx when necessary, for the sake of the
bonding ARP monitor.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-09-02 23:07:36 -07:00
Stephen Hemminger 0fc0b732ea netdev: drivers should make ethtool_ops const
No need to put ethtool_ops in data, they should be const.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-09-02 01:03:33 -07:00
Roland Dreier 4a7eca824c Merge branches 'ehca', 'misc', 'mlx4', 'mthca' and 'nes' into for-linus 2009-06-23 10:38:47 -07:00
Alexander Schmidt 1d4d6da535 IB/ehca: Bump version number
Increment version number for DMEM toleration.

Signed-off-by: Alexander Schmidt <alexs@linux.vnet.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-06-23 10:30:04 -07:00
Roland Dreier 99987bea47 IB/mthca: Replace dma_sync_single() use with proper functions
dma_sync_single() is deprecated now, and the use in mthca is wrong:
there should be a dma_sync_single_for_cpu() before touching the memory
from the CPU, and a dma_sync_single_for_device() afterwards.  Fix
this, prompted by a kick in the pants from a patch from FUJITA
Tomonori <fujita.tomonori@lab.ntt.co.jp>.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-06-22 23:04:13 -07:00
Faisal Latif 68237a0ff8 RDMA/nes: Fix FIN state handling under error conditions
During cluster testing, one QP was not closed, as FIN is not handled
properly when its rexmit count expires or in some cases when RST is
received after sending FIN.  The reason is that the cm_id does not get
decremented under these conditions.

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-06-22 22:53:28 -07:00
Faisal Latif 66388d67a0 RDMA/nes: Fix max_qp_init_rd_atom returned from query device
In nes_query_device(), max_qp_init_rd_atom is incorrectly set to
max_qp_wr.  This was found when a test application had a dapl async
event error.

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-06-22 22:52:30 -07:00
Roel Kluin af04662b4d IB/ehca: Ensure that guid_entry index is not negative
This prevents the memcpy() of a guid_entries element using a negative index.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-06-22 22:23:48 -07:00
Hannes Hering 0cf89dcdbc IB/ehca: Tolerate dynamic memory operations before driver load
Implement toleration of dynamic memory operations and 16 GB gigantic
pages, where "toleration" means that the driver can cope with dynamic
memory operations that happen before the driver is loaded.  While the
ehca driver is loaded, dynamic memory operations are still prohibited
by returning NOTIFY_BAD from the memory notifier.

On module load the driver walks through available system memory,
checks for available memory ranges and then registers the kernel
internal memory region accordingly.  The translation of address ranges
is implemented via a 3-level busmap.

Signed-off-by: Hannes Hering <hering2@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-06-22 22:18:51 -07:00
Greg Kroah-Hartman f899c2ddd4 infiniband: ehca: remove driver_data direct access of struct device
In the near future, the driver core is going to not allow direct access
to the driver_data pointer in struct device.  Instead, the functions
dev_get_drvdata() and dev_set_drvdata() should be used.  These functions
have been around since the beginning, so are backwards compatible with
all older kernel versions.

Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Roland Dreier <rolandd@cisco.com>
Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
Cc: general@lists.openfabrics.org
Cc: Christoph Raisch <raisch@de.ibm.com>
Acked-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-06-15 21:30:27 -07:00
Roland Dreier 8d34ff3401 Merge branches 'cxgb3', 'ehca', 'misc', 'mlx4', 'mthca' and 'nes' into for-linus 2009-06-14 13:31:19 -07:00
Roland Dreier 9aa0a489d9 IB/mthca: Don't double-free IRQs when falling back from MSI-X to INTx
When both MSI-X and legacy INTx fail to generate an interrupt, the
driver frees the MSI-X interrupts twice.  Fix this by clearing the
have_irq flag for the MSI-X interrupts when they are freed the first
time.
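
A simplified sketch of the guard described here.  struct toy_eq,
toy_free_irq() and the call counter are invented for the example; the point
is that clearing the flag on the first free turns a second pass into a
no-op.

```c
#include <stdbool.h>

struct toy_eq {
    int irq;
    bool have_irq;  /* true while the IRQ is requested and not yet freed */
};

static int free_irq_calls;  /* counts real frees, for inspection only */

static void toy_free_irq(int irq)
{
    (void)irq;
    free_irq_calls++;
}

/* Free every IRQ still held, clearing the flag so that calling this again
 * (for example from a later error path) frees nothing twice. */
static void free_irqs(struct toy_eq *eq, int n)
{
    for (int i = 0; i < n; i++) {
        if (eq[i].have_irq) {
            toy_free_irq(eq[i].irq);
            eq[i].have_irq = false;
        }
    }
}
```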

Reported-by: Yinghai Lu <yhlu.kernel@gmail.com>
Tested-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-06-13 15:14:09 -07:00
Jack Morgenstein 2ac6bf4ddc IB/mlx4: Add strong ordering to local inval and fast reg work requests
The ConnectX Programmer's Reference Manual states that the "SO" bit
must be set when posting Fast Register and Local Invalidate send work
requests.  When this bit is set, the work request will be executed
only after all previous work requests on the send queue have been
executed.  (If the bit is not set, Fast Register and Local Invalidate
WQEs may begin execution too early, which violates the defined
semantics for these operations)

This fixes the issue with NFS/RDMA reported in
<http://lists.openfabrics.org/pipermail/general/2009-April/059253.html>

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Cc: <stable@kernel.org>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-06-05 10:36:24 -07:00
Joachim Fenkes 25a5239327 IB/ehca: Remove superfluous bitmasks from QP control block
All the fields in the control block are nicely right-aligned, so no
masking is necessary.

Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-06-03 13:25:42 -07:00
Steve Wise 3026c19a14 RDMA/cxgb3: Limit fast register size based on T3 limitations
T3 firmware only supports one WR's worth of page list for fast register
work requests.  The driver currently allows 2 WRs' worth, which
doesn't work for T3, so reduce the limit in the driver.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-05-27 14:43:39 -07:00
Steve Wise 7ab1a2b31d RDMA/cxgb3: Report correct port state and MTU
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-05-27 14:42:36 -07:00
Eli Cohen c1f67a88bf IB/mthca: Add module parameter for number of MTTs per segment
The current MTT allocator uses kmalloc() to allocate a buffer for its
buddy allocator, and thus is limited in the amount of MTT segments
that it can control.  As a result, the size of memory that can be
registered is limited too.  This patch uses a module parameter to
control the number of MTT entries that each segment represents,
allowing more memory to be registered with the same number of
segments.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-05-27 14:36:16 -07:00
Roel Kluin 28e43a519b RDMA/nes: Fix off-by-one bugs in reset_adapter_ne020() and init_serdes()
With a postfix increment, i is incremented one past 10K/5K before the
loop ends, so the error messages will be displayed too soon if the
test succeeds on the last iteration.  Fix the comparisons to be >
instead of >=.
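
The off-by-one can be reproduced in a few lines of userspace C.  struct
fake_hw and hw_ready() are stand-ins for the hardware poll, and the limit
is a parameter instead of 10K/5K; the loop shape is the part that matters.

```c
#include <stdbool.h>

/* Fake device that reports ready on its Nth poll. */
struct fake_hw {
    int polls;
    int ready_on_poll;
};

static bool hw_ready(struct fake_hw *hw)
{
    return ++hw->polls >= hw->ready_on_poll;
}

/* Buggy check: success on the last permitted poll leaves i == limit, so
 * ">=" reports a timeout that did not happen. */
static int wait_buggy(struct fake_hw *hw, int limit)
{
    int i = 0;

    while (!hw_ready(hw) && i++ < limit) {
        /* a real driver would delay here */
    }
    return (i >= limit) ? -1 : 0;
}

/* Fixed check: a real timeout leaves i == limit + 1 because of the postfix
 * increment, so ">" separates the two cases. */
static int wait_fixed(struct fake_hw *hw, int limit)
{
    int i = 0;

    while (!hw_ready(hw) && i++ < limit) {
        /* a real driver would delay here */
    }
    return (i > limit) ? -1 : 0;
}
```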

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-05-15 10:16:45 -07:00
Jack Stone 5b891a9332 infiniband: Remove void casts
Remove unneeded casts of void *.

Signed-off-by: Jack Stone <jwjstone@fastmail.fm>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-05-13 16:53:39 -07:00
Stefan Roscher bde2cfaf8f IB/ehca: Increment version number
Signed-off-by: Stefan Roscher <stefan.roscher@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-05-13 16:52:43 -07:00
Stefan Roscher 1988d1fa1a IB/ehca: Remove unnecessary memory operations for userspace queue pairs
The queue map for flush completion circumvention is only used for
kernel space queue pairs.  This patch skips the allocation of the
queue maps in case the QP is created for userspace.  In addition, this
patch does not iomap the galpas for kernel usage if the queue pair is
only used in userspace.  These changes will improve the performance of
creation of userspace queue pairs.

Signed-off-by: Stefan Roscher <stefan.roscher@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-05-13 16:52:43 -07:00
Stefan Roscher c94f156f63 IB/ehca: Fall back to vmalloc() for big allocations
In case of large queue pairs there is the possibility of allocation
failures due to memory fragmentation when using kmalloc().  To ensure
the memory is allocated even if kmalloc() can not find chunks which
are big enough, we fall back to allocating the memory with vmalloc().

Signed-off-by: Stefan Roscher <stefan.roscher@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-05-13 16:52:42 -07:00
Anton Blanchard bf31a1a02e IB/ehca: Replace vmalloc() with kmalloc() for queue allocation
To improve performance of driver resource allocation, replace
vmalloc() calls with kmalloc().

Signed-off-by: Stefan Roscher <stefan.roscher@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-05-13 16:52:40 -07:00
Linus Torvalds c98861f7de Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  IB/mlx4: Don't overwrite fast registration page list when posting work request
  RDMA/cxgb3: Don't complete flushed send work requests twice
2009-05-13 16:31:12 -07:00
Roland Dreier 8be741b0ac Merge branches 'cxgb3' and 'mlx4' into for-linus 2009-05-13 15:16:17 -07:00
Al Viro 265e771e81 Fix deadlock in ipathfs ->get_sb()
forgot to unlock superblock before calling deactivate_super()...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-05-09 10:49:40 -04:00
Jack Morgenstein 2b6b7d4be4 IB/mlx4: Don't overwrite fast registration page list when posting work request
The low-level mlx4 driver modified the page-list addresses for fast
register work requests post send to big-endian, and set a "present"
bit.  This caused problems later when the consumer attempted to unmap
the pages using the page-list (using the list addresses which were
assumed to be still in CPU-endian order).  Fix the mlx4 driver to
allocate two buffers and use a private buffer for the hardware-format
bus addresses.
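
A rough sketch of the two-buffer idea in portable C.  The names
put_be64(), build_hw_page_list() and TOY_MTT_FLAG_PRESENT (including its
value) are assumptions for the example: the caller's CPU-endian list is
read-only, and the big-endian entries with the "present" bit go into a
separate private buffer.

```c
#include <stdint.h>
#include <stddef.h>

#define TOY_MTT_FLAG_PRESENT 1ULL  /* assumed "present" bit for the sketch */

/* Store v into out[0..7] most-significant byte first (big-endian). */
static void put_be64(uint8_t out[8], uint64_t v)
{
    for (int i = 7; i >= 0; i--) {
        out[i] = (uint8_t)(v & 0xff);
        v >>= 8;
    }
}

/* Fill the private hardware-format buffer hw[] from the consumer's list.
 * page_list is const: the consumer can still use it to unmap the pages. */
static void build_hw_page_list(uint8_t (*hw)[8],
                               const uint64_t *page_list, size_t n)
{
    for (size_t i = 0; i < n; i++)
        put_be64(hw[i], page_list[i] | TOY_MTT_FLAG_PRESENT);
}
```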

This patch fixes <https://bugs.openfabrics.org/show_bug.cgi?id=1571>,
an NFS/RDMA server crash.  The cause of the crash was found by Vu Pham
of Mellanox.  The fix is along the lines suggested by Steve Wise in
comment #21 in bug 1571.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-05-07 21:35:13 -07:00
Steve Wise ec6995ddaa RDMA/cxgb3: Don't complete flushed send work requests twice
When the SQ is flushed, mark the flushed entries as not signaled so
the poll logic doesn't re-insert the CQ entry thinking it's an out of
order completion.

The bug can cause the NFS/RDMA server to crash due to processing the
same completed work request twice.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-29 15:15:59 -07:00
Roland Dreier 9308f96c79 Merge branches 'cxgb3', 'ipoib', 'mthca', 'mlx4' and 'nes' into for-linus 2009-04-28 16:01:31 -07:00
Chien Tung 26cc5e57bb RDMA/nes: Update iw_nes version
Update version number to 1.5.0.0

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-27 13:46:29 -07:00
Faisal Latif 9256b25130 RDMA/nes: Fix error path in nes_accept()
If reg_phys_mem() fails, we need to free memory allocated for MPA
frame with private data before returning the error. Also move
nes_add_ref() after the reg_phys_mem() is successful.

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-27 13:45:19 -07:00
Faisal Latif 109d67e4f1 RDMA/nes: Fix hang issues for large cluster dynamic connections
Running large cluster setup, we are hanging after many hours of
testing.  Fixing this required going over the code and making sure the
rexmit entry was properly removed based on the cm_node's state and
packet received.  Also when receiving a FIN packet, check seq# and
make sure there were no errors before calling handle_fin().

Following are the changes done in nes_cm.c:

* handle_ack_pkt() needs to return error value, so in case of error,
  handle_fin() is not called. Some cleanup done while going over the code.

* handle_rst_pkt(), handling of cm_node's NES_CM_STATE_LAST_ACK is missing.

* process_packet(), in case of FIN only packet is received, call
  check_seq() before processing.

* in handle_fin_pkt(), we are calling cleanup_retrans_entry() for all
  conditions, even if the packets need to be dropped.

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-27 13:41:06 -07:00
Faisal Latif 4e9c390036 RDMA/nes: Increase rexmit timeout interval
Under heavy load with large cluster testing, it may take longer to
receive a response to MPA requests.  Change the driver to wait longer
after each rexmit to max time value.

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-27 13:39:36 -07:00
Faisal Latif c11470f9f4 RDMA/nes: Check for sequence number wrap-around
check_seq() was not checking if the seq#s have wrapped.  Fix it.
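
The usual wrap-safe comparison for 32-bit sequence numbers looks roughly
like the sketch below.  This is the generic serial-number technique, not
necessarily what check_seq() does, and the function names are made up for
the example.

```c
#include <stdint.h>
#include <stdbool.h>

/* True if a comes before b in 32-bit sequence space.  The unsigned
 * subtraction wraps, and the signed view of the difference gives the
 * direction.  Valid while a and b are less than 2^31 apart; relies on the
 * usual two's-complement conversion to int32_t. */
static bool seq_before(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) < 0;
}

/* Naive comparison, shown for contrast: wrong once the numbers wrap. */
static bool seq_before_naive(uint32_t a, uint32_t b)
{
    return a < b;
}
```

Just after a wrap, 0xFFFFFFF0 is "before" 0x00000010 in sequence space even
though it is numerically larger.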

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-27 13:38:31 -07:00
Faisal Latif 53094c388f RDMA/nes: Do not set apbvt entry for loopback
When a connect request comes, apbvt should only be set for
non-loopback connections.

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-27 13:37:34 -07:00
Chien Tung 1f0dba1e51 RDMA/nes: Fix unused variable compile warning when INFINIBAND_NES_DEBUG=n
Remove the NES_DEBUG that is causing the compile warning about an
unused variable when INFINIBAND_NES_DEBUG is not enabled.

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-27 13:36:03 -07:00
Chien Tung 0e4562da9e RDMA/nes: Fix fw_ver in /sys
/sys/class/infiniband/nes?/fw_ver is not displaying firmware version
properly (it shows 0.0.0 with the current code).  Fill in the correct
firmware version number.

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-27 13:33:48 -07:00
Chien Tung 923223776b RDMA/nes: Set trace length to 1 inch for SFP_D
With updated PHY firmware for SFP_D, setting the trace length to 1
inch for SFP_D provides a more stable link.

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-27 13:30:35 -07:00
Chien Tung e998c25bc2 RDMA/nes: Enable repause timer for port 1
Enable repause timer for port 1.  Without this setting, under stress,
the chip may misbehave.

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-27 13:29:42 -07:00
Chien Tung 366835e249 RDMA/nes: Correct CDR loop filter setting for port 1
In commit 1b949324 ("RDMA/nes: Fix SFP+ PHY initialization") there is
a mistake in the clean up code that removed port 1 CDR loop filter
settings for 10G cards other than CX4.  Put the correct setting back
for appropriate PHY types.

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-27 13:28:41 -07:00
Chien Tung 010db4d127 RDMA/nes: Modify thermo mitigation to flip SerDes1 ref clk to internal
Change thermo mitigation code to flip the SerDes1 reference clock to
internal, to match the change in commit a4849fc1 ("RDMA/nes: Add
wide_ppm_offset parm for switch compatibility").

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-27 13:27:21 -07:00
Miroslaw Walukiewicz 5d1af5c832 RDMA/nes: Fix resource issues in nes_create_cq() and nes_destroy_cq()
In error paths where a CQ is not created, pbl is not freed properly.

In nes_destroy_cq(), add the corresponding check for nescq->mcrqf to
not call nes_free_resource() when it is already done in nes_create_cq().

Signed-off-by: Miroslaw Walukiewicz <miroslaw.walukiewicz@intel.com>
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-21 16:16:48 -07:00
Matt Kraai cc005fa20c RDMA/nes: Remove root_256()'s unused pbl_count_256 parameter
Signed-off-by: Matt Kraai <kraai@ftbfs.org>
Acked-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-21 10:43:21 -07:00
Jack Morgenstein 8531f1f14a IB/mthca: Fix timeout for INIT_HCA and a few other commands
Commands INIT_HCA, CLOSE_HCA, SYS_EN, SYS_DIS, and CLOSE_IB all have 1
second timeouts.  For INIT_HCA this causes problems when more than
2^18 QPs are configured, since the command takes more than 1 second to
complete.

All other commands have 60-second timeouts.  This patch makes the
above commands consistent with the rest of the commands (and with the
chip documentation).

This patch is an expansion of a patch from Arthur Kepner
<akepner@sgi.com> fixing just the INIT_HCA timeout.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-20 21:12:25 -07:00
Steve Wise cde9e2f930 RDMA/cxgb3: Don't zero QP attrs when moving to IDLE
QP attributes must stay initialized when moving back to IDLE.  Zeroing
them will crash the system in _flush_qp() if the QP is subsequently
moved to ERROR and back to IDLE.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-20 17:00:53 -07:00
Don Wood 3f32eb1185 RDMA/nes: Fix bugs in nes_reg_phys_mr()
The code incorrectly failed memory registration if the buffer was not
page aligned.  Also, the length field is mangled causing the hardware
to think the registration is much larger than it really is.

The fix is to remove the page alignment restriction as well as the
incorrect length adjustment.  Also make sure that all buffers after
the first start at a page boundary, and all buffers except the last
end on a page boundary.
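
The boundary rule in the last sentence can be written as a small check.
This is a userspace sketch; struct toy_phys_buf, TOY_PAGE_SIZE and
phys_buf_list_ok() are names and values assumed for the example.

```c
#include <stdint.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 4096u  /* assumed page size for the sketch */

struct toy_phys_buf {
    uint64_t addr;
    uint64_t size;
};

/* Every buffer after the first must start on a page boundary, and every
 * buffer except the last must end on one.  The first start and the last
 * end may be unaligned. */
static bool phys_buf_list_ok(const struct toy_phys_buf *buf, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (i > 0 && buf[i].addr % TOY_PAGE_SIZE != 0)
            return false;
        if (i + 1 < n && (buf[i].addr + buf[i].size) % TOY_PAGE_SIZE != 0)
            return false;
    }
    return true;
}
```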

Signed-off-by: Don Wood <donald.e.wood@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-20 14:53:00 -07:00
Chien Tung 1af9222b52 RDMA/nes: Fix compiler warning at nes_verbs.c:1955
Initialize pbl_count_256 to 0 to get rid of the warning:

    drivers/infiniband/hw/nes/nes_verbs.c: In function 'nes_reg_mr':
    drivers/infiniband/hw/nes/nes_verbs.c:1955: warning: 'pbl_count_256' may be used uninitialized in this function

Reported-by: Roland Dreier <rdreier@cisco.com>
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-20 14:50:36 -07:00
Steve Wise 96ac7e8892 RDMA/cxgb3: Adjust ORD/IRD (if needed) for peer2peer connections
NFS/RDMA currently fails to set up connections if peer2peer is on.
This is due to the fact that the NFS/RDMA client sets its ORD to 0.

If peer2peer is set, make sure the active side ORD is >= 1 and the
passive side IRD is >=1.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-20 13:53:15 -07:00
Linus Torvalds 0534c8cb5c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  RDMA/nes: Add support for new SFP+ PHY
  RDMA/nes: Add wide_ppm_offset parm for switch compatibility
  RDMA/nes: Fix SFP+ PHY initialization
  RDMA/nes: Fix nes_nic_cm_xmit() error handling
  RDMA/nes: Fix error handling issues
  RDMA/nes: Fix incorrect casts on 32-bit architectures
  IPoIB: Document newish features
  RDMA/cma: Create cm id even when IB port is down
  RDMA/cma: Use rate from IPoIB broadcast when joining IPoIB multicast groups
  IPoIB: Avoid free_netdev() BUG when destroying a child interface
  mlx4_core: Don't leak mailbox for SET_PORT on Ethernet ports
  RDMA/cxgb3: Release dependent resources only when endpoint memory is freed.
  RDMA/cxgb3: Handle EEH events
  IB/mlx4: Use pgprot_writecombine() for BlueFlame pages
2009-04-09 16:42:26 -07:00
Roland Dreier 07306c0b98 Merge branches 'cma', 'cxgb3', 'ipoib', 'mlx4' and 'nes' into for-next 2009-04-08 14:28:21 -07:00
Chien Tung 4303565df4 RDMA/nes: Add support for new SFP+ PHY
Add new register settings for new SFP+ PHY/firmware.
Add new PHY to nes_netdev_get/set_settings.

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-08 14:27:56 -07:00
Chien Tung a4849fc157 RDMA/nes: Add wide_ppm_offset parm for switch compatibility
We have observed unstable link with a new BNT switch.

Add wide_ppm_offset parameter to allow the user to control the clock
ppm offset on the CX4 interface for better compatibility.  Default is
100ppm, setting it to 1 will increase it to 300ppm.  Change default
SerDes1 reference clock to external source.

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-08 14:27:18 -07:00
Chien Tung 1b9493248c RDMA/nes: Fix SFP+ PHY initialization
SFP+ PHY initialization has very long delays, incorrect settings for
direct attach copper cables, and inconsistent link detection.

Adjust delays to the minimum required by the PHY.  Worst case is now
less than 4 seconds.  Add new register settings for direct attach
cables.  Change link detection logic to use two new registers for more
consistent link state detection.  Reorganize code to shorten line
length.

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-08 14:27:09 -07:00
Faisal Latif 5962c2c803 RDMA/nes: Fix nes_nic_cm_xmit() error handling
We are seeing crashes or hangs when running network cable pull tests
during RDMA traffic.

In schedule_nes_timer(), we return an error if nes_nic_cm_xmit()
returns failure.  This is changed to success as skb is being put on
the timer routines to be processed later.  In send_syn() case, we are
indicating connect failure once from nes_connect() and the other when
the rexmit retries expires.

The other issue is skb->users which we are incrementing before calling
nes_nic_cm_xmit() which calls dev_queue_xmit() but in case of failure
we are decrementing the skb->users at the same time putting the skb on
the rexmit path.  Even if dev_queue_xmit() fails, the skb->users is
decremented already.  We are removing the decrement of skb->users in
case of failure from both schedule_nes_timer() as well as from
nes_cm_timer_tick().

The extra check in nes_cm_timer_tick() for rexmit failure, which breaks
out of the loop, is also removed.  The break causes problems because the
other nodes have their cm_node->ref_count incremented and are then not
processed.

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-08 14:23:55 -07:00
Faisal Latif 79fc3d7410 RDMA/nes: Fix error handling issues
Fix issues found by static code analysis:

(1) Check if cm_node was successfully created for loopback connection.

(2) schedule_nes_timer() does not free up allocated memory after
    encountering an error.  There is a WARN_ON() for this condition.

(3) there is a cm_node->freed flag which is set but not used.

Reported-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-08 14:22:20 -07:00
Don Wood 7a5efb62f6 RDMA/nes: Fix incorrect casts on 32-bit architectures
There were some incorrect casts to unsigned long that caused 64-bit values
to be truncated on 32-bit architectures and made the driver pass invalid
addresses and lengths to the hardware.  The problems were primarily seen
with kernels with highmem configured but some could show up in
non-highmem kernels, too.
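
The truncation is easy to reproduce in userspace if uint32_t is used as a
stand-in for unsigned long on a 32-bit architecture (an assumption of this
sketch; the type and function names are invented).

```c
#include <stdint.h>

/* Stand-in for "unsigned long" as it would be on a 32-bit architecture. */
typedef uint32_t ulong32;

/* Passing a 64-bit bus address through the narrow type drops the upper
 * 32 bits without any diagnostic. */
static uint64_t via_narrow_cast(uint64_t bus_addr)
{
    return (uint64_t)(ulong32)bus_addr;
}

/* Keeping the value in a 64-bit type preserves it. */
static uint64_t via_wide_type(uint64_t bus_addr)
{
    return bus_addr;
}
```

Addresses below 4 GB survive the narrow cast, which fits the observation
that the problem showed up mostly with highmem configured.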

Signed-off-by: Don Wood <donald.e.wood@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-08 14:21:02 -07:00
Yang Hongyang 284901a90a dma-mapping: replace all DMA_32BIT_MASK macro with DMA_BIT_MASK(32)
Replace all DMA_32BIT_MASK macro with DMA_BIT_MASK(32)
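
For reference, a userspace sketch of a DMA_BIT_MASK(n)-style macro and
what it evaluates to.  The special case for 64 avoids shifting a 64-bit
value by its full width, which is undefined in C; fits_in_mask() is a
helper invented for the example.

```c
#include <stdint.h>
#include <stdbool.h>

/* All-ones mask covering the low n bits, for 1 <= n <= 64. */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Hypothetical helper: can a device limited to this mask reach addr? */
static bool fits_in_mask(uint64_t addr, uint64_t mask)
{
    return (addr & ~mask) == 0;
}
```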

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-07 08:31:11 -07:00
Yang Hongyang 6a35528a83 dma-mapping: replace all DMA_64BIT_MASK macro with DMA_BIT_MASK(64)
Replace all DMA_64BIT_MASK macro with DMA_BIT_MASK(64)

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-07 08:31:10 -07:00
Steve Wise 874d8df5ed RDMA/cxgb3: Release dependent resources only when endpoint memory is freed.
The cxgb3 l2t entry, hwtid, and dst entry were being released before
all the iwch_ep references were released.  This can cause a crash in
t3_l2t_send_slow() and other places where the l2t entry is used.

The fix is to defer releasing these resources until all endpoint
references are gone.

Details:

- move flags field to the iwch_ep_common struct.
- add a flag indicating resources are to be released.
- release resources at endpoint free time instead of close/abort time.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-03-30 08:37:59 -07:00
Steve Wise 04b5d028f5 RDMA/cxgb3: Handle EEH events
- wrap calls into cxgb3 and fail them if we're in the middle
  of a PCI EEH event.

- correctly unwind and release endpoint and other resources when
  we are in an EEH event.

- dispatch IB_EVENT_DEVICE_FATAL event when cxgb3 notifies iw_cxgb3 of
  a fatal error.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-03-30 08:37:56 -07:00
Roland Dreier e1d60ec669 IB/mlx4: Use pgprot_writecombine() for BlueFlame pages
The PAT work on x86 has finally made pgprot_writecombine() a usable API
for modular drivers.  As the comment indicates, this is exactly what we
want to use in mlx4_ib to map BlueFlame pages up to userspace, since
using WC for these pages improves small message latency significantly.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-03-30 08:31:05 -07:00
Roland Dreier 7c757eb9f8 RDMA/nes: Fix mis-merge
When net-next and infiniband were merged upstream, each branch deleted
one of a pair of adjacent lines from nes_nic.c, but when Linus fixed the
conflict up, he brought back both of the lines.  Fix up to the intended
final tree state.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-26 17:00:25 -07:00
Linus Torvalds 6671de344c Merge branch 'timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (26 commits)
  posix timers: fix RLIMIT_CPU && fork()
  time: ntp: fix bug in ntp_update_offset() & do_adjtimex(), fix
  time: ntp: clean up second_overflow()
  time: ntp: simplify ntp_tick_adj calculations
  time: ntp: make 64-bit constants more robust
  time: ntp: refactor do_adjtimex() some more
  time: ntp: refactor do_adjtimex()
  time: ntp: fix bug in ntp_update_offset() & do_adjtimex()
  time: ntp: micro-optimize ntp_update_offset()
  time: ntp: simplify ntp_update_offset_fll()
  time: ntp: refactor and clean up ntp_update_offset()
  time: ntp: refactor up ntp_update_frequency()
  time: ntp: clean up ntp_update_frequency()
  time: ntp: simplify the MAX_TICKADJ_SCALED definition
  time: ntp: simplify the second_overflow() code flow
  time: ntp: clean up kernel/time/ntp.c
  x86: hpet: stop HPET_COUNTER when programming periodic mode
  x86: hpet: provide separate functions to stop and start the counter
  x86: hpet: print HPET registers during setup (if hpet=verbose is used)
  time: apply NTP frequency/tick changes immediately
  ...
2009-03-26 16:05:42 -07:00
Linus Torvalds 13220a94d3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1750 commits)
  ixgbe: Allow Priority Flow Control settings to survive a device reset
  net: core: remove unneeded include in net/core/utils.c.
  e1000e: update version number
  e1000e: fix close interrupt race
  e1000e: fix loss of multicast packets
  e1000e: commonize tx cleanup routine to match e1000 & igb
  netfilter: fix nf_logger name in ebt_ulog.
  netfilter: fix warning in ebt_ulog init function.
  netfilter: fix warning about invalid const usage
  e1000: fix close race with interrupt
  e1000: cleanup clean_tx_irq routine so that it completely cleans ring
  e1000: fix tx hang detect logic and address dma mapping issues
  bridge: bad error handling when adding invalid ether address
  bonding: select current active slave when enslaving device for mode tlb and alb
  gianfar: reallocate skb when headroom is not enough for fcb
  Bump release date to 25Mar2009 and version to 0.22
  r6040: Fix second PHY address
  qeth: fix wait_event_timeout handling
  qeth: check for completion of a running recovery
  qeth: unregister MAC addresses during recovery.
  ...

Manually fixed up conflicts in:
	drivers/infiniband/hw/cxgb3/cxio_hal.h
	drivers/infiniband/hw/nes/nes_nic.c
2009-03-26 15:54:36 -07:00
Linus Torvalds 39b566eedb Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (30 commits)
  RDMA/cxgb3: Enforce required firmware
  IB/mlx4: Unregister IB device prior to CLOSE PORT command
  mlx4_core: Add link type autosensing
  mlx4_core: Don't perform SET_PORT command for Ethernet ports
  RDMA/nes: Handle MPA Reject message properly
  RDMA/nes: Improve use of PBLs
  RDMA/nes: Remove LLTX
  RDMA/nes: Inform hardware that asynchronous event has been handled
  RDMA/nes: Fix tmp_addr compilation warning
  RDMA/nes: Report correct vendor_id and vendor_part_id
  RDMA/nes: Update copyright to new legal entity and year
  RDMA/nes: Account for freed PBL after HW operation
  IB: Remove useless ibdev_is_alive() tests from sysfs code
  IB/sa_query: Fix AH leak due to update_sm_ah() race
  IB/mad: Fix ib_post_send_mad() returning 0 with no generate send comp
  IB/mad: initialize mad_agent_priv before putting on lists
  IB/mad: Fix null pointer dereference in local_completions()
  IB/mad: Fix RMPP header RRespTime manipulation
  IB/iser: Remove hard setting of path MTU
  mlx4_core: Add device IDs for MT25458 10GigE devices
  ...
2009-03-26 15:47:08 -07:00
David S. Miller 08abe18af1 Merge branch 'master' of /home/davem/src/GIT/linux-2.6/
Conflicts:
	drivers/net/wimax/i2400m/usb-notif.c
2009-03-26 15:23:24 -07:00
Ingo Molnar 7c526e1fef Merge branches 'timers/new-apis', 'timers/ntp' and 'timers/urgent' into timers/core 2009-03-26 15:45:52 +01:00
Roland Dreier 09f98bafea Merge branches 'cxgb3', 'endian', 'ipath', 'ipoib', 'iser', 'mad', 'misc', 'mlx4', 'mthca', 'nes' and 'sysfs' into for-next 2009-03-24 20:44:41 -07:00
Steve Wise d1fbe04eee RDMA/cxgb3: Enforce required firmware
The cxgb3 NIC driver can handle more firmware versions than iw_cxgb3,
and since commit 8207befa ("cxgb3: untie strict FW matching") cxgb3
will load with firmware versions that iw_cxgb3 can't handle.  The FW
major number indicates a specific interface between the FW and
iw_cxgb3.  Thus if the major number of the running firmware does not
match the required version compiled into iw_cxgb3, then iw_cxgb3 must
not register that device.
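
A minimal sketch of such a check, assuming a hypothetical
REQUIRED_FW_MAJOR constant and a hypothetical struct fw_version (neither
name comes from the driver):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical value: the FW major version this build was compiled for. */
#define REQUIRED_FW_MAJOR 7

/* Hypothetical description of the firmware running on the adapter. */
struct fw_version {
    uint16_t fw_major;
    uint16_t fw_minor;
};

/* The major number identifies the FW interface, so it must match exactly. */
static bool fw_major_matches(const struct fw_version *running)
{
    return running->fw_major == REQUIRED_FW_MAJOR;
}

/* Returns 0 if the device may be registered, -1 if it must be skipped. */
static int try_register(const struct fw_version *running)
{
    if (!fw_major_matches(running))
        return -1;
    return 0;
}
```

Only the major number is compared; a different minor number is assumed
here not to change the interface.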

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-03-24 20:44:18 -07:00
Stephen Hemminger d0929553be infiniband: convert nes driver to net_device_ops
Also, removed unnecessary memset() since alloc_netdev returns
zeroed memory.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-21 19:19:13 -07:00
Stephen Hemminger 687c75dcf3 infiniband: convert c2 to net_device_ops
Convert this driver to new net_device_ops infrastructure.
Also use default net_device get-stats infrastructure

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-21 19:19:13 -07:00
Yevgeny Petrilin a6a47771b1 IB/mlx4: Unregister IB device prior to CLOSE PORT command
According to the ConnectX programmer's reference manual, all
operations should be stopped, all QPs should be torn down and all WQEs
flushed before the CLOSE_PORT command is invoked.  In some cases
reversing the order of operations (as implemented now) could cause
a loss of completions.

Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-03-18 19:49:54 -07:00
Faisal Latif c12e56ef69 RDMA/nes: Don't allow userspace QPs to use STag zero
STag zero is a special STag that allows consumers to access any bus
address without registering memory.  The nes driver unfortunately
allows STag zero to be used even with QPs created by unprivileged
userspace consumers, which means that any process with direct verbs
access to the nes device can read and write any memory accessible to
the underlying PCI device (usually any memory in the system).  Such
access is usually given for cluster software such as MPI to use, so
this is a local privilege escalation bug on most systems running this
driver.

The driver was using STag zero to receive the last streaming mode
data; to allow STag zero to be disabled for unprivileged QPs, the
driver now registers a special MR for this data.

Cc: <stable@kernel.org>
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-12 16:21:41 -07:00
Faisal Latif 9d5ab13325 RDMA/nes: Handle MPA Reject message properly
During testing, failures occur because the MPA Reject call is not
handled.  To handle the MPA Reject call, the following changes are made:

*Handle inbound/outbound MPA Reject response message.
	When nes_reject() is called for pending MPA request reply,
	send the MPA Reject message to its peer (active
	side)cm_node. The peer cm_node (active side) will indicate
	Reject message event for the pending Connect Request.

*Handle MPA Reject response message for loopback connections and listener.
	When MPA Request is rejected, check if it is a loopback
	connection and if it is then it will send Reject message event
	to its peer loopback node. Also when destroying listener,
	check if the cm_nodes for that listener are loopback or not.

*Add graceful connection close with the MPA Reject response message.
	Send graceful close (FIN, FIN ACK..) to terminate the cm_nodes.

*Some code re-org while making the above changes.
	Removed recv_list and recv_list_lock from the cm_node
	structure as there can be only one receive close entry on the
	timer. Also implemented handle_recv_entry() as receive close
	entry is processed from both nes_rem_ref_cm_node() as well as
	nes_cm_timer_tick().

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-03-06 15:15:01 -08:00
Don Wood 0145f341a9 RDMA/nes: Improve use of PBLs
Two-level 256-byte PBLs were not implemented, so the driver could report
out of memory when in fact there were PBLs still available.

This solution prefers to use 4KB PBLs over two level 256B PBLs until
the number of 4KB PBLs falls below a threshold.  At this point the 4KB
PBL structure is converted to use 256B PBLs which prevents the driver
from running out of 4KB PBLs too quickly.

Signed-off-by: Don Wood <donald.e.wood@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-03-06 15:15:00 -08:00
Faisal Latif 2869975cfb RDMA/nes: Remove LLTX
NETIF_F_LLTX is deprecated. Remove private TX locking from the driver
and remove the NETIF_F_LLTX feature flag.  This also fixes a warning
in some configs that comes from doing skb_linearize() call in the
hard_start_xmit method with IRQs disabled (if HIGHMEM is enabled,
skb_linearize() may end up enabling BHs, which is a no-no if hard IRQs
are disabled in that context).  By getting rid of LLTX, we do not
disable IRQs when skb_linearize() is called.

Remove the sq_lock as it is not needed for non-LLTX.  Fix ethtool not
to show the counter for sq_lock.

Reported-by: aluno3@poczta.onet.pl
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-03-06 15:12:11 -08:00
Don Wood fd87778cb9 RDMA/nes: Inform hardware that asynchronous event has been handled
When asynchronous events are processed by software, it is necessary
to let the hardware know that software has handled the event.  This
frees up the entry in the asynchronous event queue.

Signed-off-by: Don Wood <donald.e.wood@intel.com>
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-03-06 15:12:11 -08:00
Chien Tung 7b14ab0b43 RDMA/nes: Fix tmp_addr compilation warning
In find_node(), tmp_addr causes an "unused variable" warning when
INFINIBAND_NES_DEBUG is not defined.  It's only used in a nes_debug()
and the print does not make sense.  So take out the whole thing.

Reported-by: Manish Katiyar <mkatiyar@gmail.com>
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-03-06 15:12:11 -08:00
Chien Tung b9c367e7e6 RDMA/nes: Report correct vendor_id and vendor_part_id
ibv_devinfo displays 0 for vendor_id and vendor_part_id.  Fill in OUI
and device_id for those two fields.

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-03-06 15:12:10 -08:00
Chien Tung cd6853d3eb RDMA/nes: Update copyright to new legal entity and year
Update copyright to the new legal entity, Intel-NE, Inc., an Intel
company.  Update copyright for the new year.

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-03-06 15:12:10 -08:00
Don Wood dae5d13a7e RDMA/nes: Account for freed PBL after HW operation
Fix occurrences where the software PBL counts were changed before the
hardware was updated.  This bug allowed another thread to overallocate
the hardware resources.

Add proper PBL accounting in case nes_reg_mr() fails.

Signed-off-by: Don Wood <donald.e.wood@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-03-06 15:12:09 -08:00
Roland Dreier e538052746 IB/ipath: Really run work in ipath_release_user_pages_on_close()
ipath_release_user_pages_on_close() allocated a structure to schedule
work with, but then just returned (leaking the structure) rather than
actually calling schedule_work().  Fix the logic to do what was intended.

This was spotted by the Coverity checker (CID 2700).

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-02-22 20:14:37 -08:00
Roland Dreier 71c4512201 IB/ipath: Fix memory leak in init_shadow_tids() error path
If the second vmalloc() fails, the wrong pointer is passed to vfree(), so
the first vmalloc() ends up getting leaked.
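
The intended unwinding can be sketched in user-space C as below.  This is
an illustration only: struct shadow, shadow_init() and the injectable
allocator are hypothetical, and malloc()/free() stand in for
vmalloc()/vfree().

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical pair of tables: both must exist, or neither. */
struct shadow {
    void *pages;
    void *addrs;
};

typedef void *(*alloc_fn)(size_t);

static int shadow_init(struct shadow *s, size_t n, alloc_fn alloc)
{
    s->pages = alloc(n);
    if (!s->pages)
        return -1;

    s->addrs = alloc(n);
    if (!s->addrs) {
        free(s->pages);      /* free the allocation that succeeded */
        s->pages = NULL;
        return -1;
    }
    return 0;
}

/* Demo allocator that fails on its second call to hit the error path. */
static int alloc_calls;

static void *fail_second(size_t n)
{
    return (++alloc_calls == 2) ? NULL : malloc(n);
}
```

Passing fail_second as the allocator exercises the branch where the
second allocation fails and the first one has to be given back.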

This was spotted by the Coverity checker (CID 2709).

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-02-22 20:04:34 -08:00
Ingo Molnar 74019224ac timers: add mod_timer_pending()
Impact: new timer API

Based on an idea from Martin Josefsson with the help of
Patrick McHardy and Stephen Hemminger:

introduce the mod_timer_pending() API which is a mod_timer()
offspring that is an invariant on already removed timers.

(regular mod_timer() re-activates non-pending timers.)

This is useful for the networking code in that it can
allow unserialized mod_timer_pending() timer-forwarding
calls, but a single del_timer*() will stop the timer
from being reactivated again.
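
The difference in behaviour can be shown with a toy model.  The sketch
below is not the kernel implementation: struct toy_timer and the toy_*()
functions are made up for illustration and only track whether a timer is
pending and when it would expire.

```c
#include <stdbool.h>

/* Toy timer: records only the pending state and the expiry time. */
struct toy_timer {
    bool pending;
    unsigned long expires;
};

/* Models mod_timer(): (re)arms the timer even if it was not pending.
 * Returns true if the timer was pending before the call. */
static bool toy_mod_timer(struct toy_timer *t, unsigned long expires)
{
    bool was_pending = t->pending;

    t->pending = true;
    t->expires = expires;
    return was_pending;
}

/* Models mod_timer_pending(): moves the expiry only if already pending,
 * so it can never bring a deleted timer back to life. */
static bool toy_mod_timer_pending(struct toy_timer *t, unsigned long expires)
{
    if (!t->pending)
        return false;
    t->expires = expires;
    return true;
}

/* Models del_timer(): deactivates the timer. */
static bool toy_del_timer(struct toy_timer *t)
{
    bool was_pending = t->pending;

    t->pending = false;
    return was_pending;
}
```

After toy_del_timer(), any number of toy_mod_timer_pending() calls leave
the timer inactive, whereas a single toy_mod_timer() call re-arms it.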

Also while at it:

- optimize the regular mod_timer() path some more, the
  timer-stat and a debug check was needlessly duplicated
  in __mod_timer().

- make the exports come straight after the function, as
  most other exports in timer.c already did.

- eliminate __mod_timer() as an external API, change the
  users to mod_timer().

The regular mod_timer() code path is not impacted
significantly, due to inlining optimizations and due to
the simplifications.

Based-on-patch-from: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Patrick McHardy <kaber@trash.net>
Cc: netdev@vger.kernel.org
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-18 19:26:33 +01:00
Steve Wise 4263289630 RDMA/cxgb3: Remove modulo math from build_rdma_recv()
Remove modulo usage to avoid a divide in the fast path (not all
gcc versions do strength reduction here).

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-02-16 21:23:32 -08:00
Steve Wise 42fb61f02f RDMA/cxgb3: Connection termination fixes
The poll and flush code needs to handle all send opcodes: SEND,
SEND_WITH_SE, SEND_WITH_INV, and SEND_WITH_SE_INV.

Ignore TERM indications if the connection is already gone.

Ignore HW receive completions if the RQ is empty.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-02-10 16:38:57 -08:00
Steve Wise 900f4c16c3 RDMA/cxgb3: sgl/pbl offset calculation needs 64 bits
The variable 'offset' in iwch_sgl2pbl_map() needs to be a u64.
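
Why the width matters can be seen in a generic sketch.  The code below
does not reproduce iwch_sgl2pbl_map(); it only assumes, for illustration,
an offset that accumulates 32-bit segment lengths, and shows how a 32-bit
accumulator wraps once the total passes 4 GiB.

```c
#include <stdint.h>

/* 32-bit accumulator: silently wraps modulo 2^32. */
static uint32_t total_len_u32(const uint32_t *len, int n)
{
    uint32_t off = 0;

    for (int i = 0; i < n; i++)
        off += len[i];
    return off;
}

/* 64-bit accumulator: holds the full sum of 32-bit lengths. */
static uint64_t total_len_u64(const uint32_t *len, int n)
{
    uint64_t off = 0;

    for (int i = 0; i < n; i++)
        off += len[i];
    return off;
}
```

For two 2 GiB segments followed by a 16-byte one, the 32-bit version
yields 16 while the 64-bit version yields 4 GiB + 16.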

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-02-10 16:38:22 -08:00
Moni Shoua 270b8b8513 IB/mthca: Fix dispatch of IB_EVENT_LID_CHANGE event
When snooping a PortInfo MAD, its client_reregister bit is checked.
If the bit is ON then a CLIENT_REREGISTER event is dispatched,
otherwise a LID_CHANGE event is dispatched.  This decision logic
ignores the cases where the MAD changes the LID along with an
instruction to reregister (so a necessary LID_CHANGE event won't be
dispatched) or the MAD is neither of these (and an unnecessary
LID_CHANGE event will be dispatched).

This causes problems at least with IPoIB, which will do a "light"
flush on reregister, rather than the "heavy" flush required due to a
LID change.

Fix this by dispatching a CLIENT_REREGISTER event if the
client_reregister bit is set, but also compare the LID in the MAD to
the current LID.  If and only if they are not identical then a
LID_CHANGE event is dispatched.
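
The old and new decisions can be compared in a small sketch.  The event
flags and function names below are hypothetical and only model which
events would be dispatched for a snooped PortInfo MAD:

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative event flags, so that a result can carry both events. */
enum {
    EV_CLIENT_REREGISTER = 1 << 0,
    EV_LID_CHANGE        = 1 << 1,
};

/* Old logic: exactly one of the two events, chosen by the bit alone. */
static unsigned int events_old(bool client_reregister)
{
    return client_reregister ? EV_CLIENT_REREGISTER : EV_LID_CHANGE;
}

/* Fixed logic: the two conditions are checked independently. */
static unsigned int events_fixed(bool client_reregister,
                                 uint16_t mad_lid, uint16_t cur_lid)
{
    unsigned int ev = 0;

    if (client_reregister)
        ev |= EV_CLIENT_REREGISTER;
    if (mad_lid != cur_lid)
        ev |= EV_LID_CHANGE;
    return ev;
}
```

The two problem cases from the description are events_fixed(true, 5, 4),
which now reports both events, and events_fixed(false, 4, 4), which now
reports none.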

Signed-off-by: Moni Shoua <monis@voltaire.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Yossi Etigin <yosefe@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-01-28 15:15:56 -08:00
Moni Shoua f0f6f346a1 IB/mlx4: Fix dispatch of IB_EVENT_LID_CHANGE event
When snooping a PortInfo MAD, its client_reregister bit is checked.
If the bit is ON then a CLIENT_REREGISTER event is dispatched,
otherwise a LID_CHANGE event is dispatched.  This decision logic
ignores the cases where the MAD changes the LID along with an
instruction to reregister (so a necessary LID_CHANGE event won't be
dispatched) or the MAD is neither of these (and an unnecessary
LID_CHANGE event will be dispatched).

This causes problems at least with IPoIB, which will do a "light"
flush on reregister, rather than the "heavy" flush required due to a
LID change.

Fix this by dispatching a CLIENT_REREGISTER event if the
client_reregister bit is set, but also compare the LID in the MAD to
the current LID.  If and only if they are not identical then a
LID_CHANGE event is dispatched.

Signed-off-by: Moni Shoua <monis@voltaire.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Yossi Etigin <yosefe@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-01-28 14:54:35 -08:00
Divy Le Ray a73efd0a85 iw_cxgb3: handle chip reset notifications
Freeze activity when notified that the underlying chip
is getting reset on a EEH event or fatal error.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-26 22:22:19 -08:00
Ben Hutchings 288379f050 net: Remove redundant NAPI functions
Following the removal of the unused struct net_device * parameter from
the NAPI functions named *netif_rx_* in commit 908a7a1, they are
exactly equivalent to the corresponding *napi_* functions and are
therefore redundant.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-21 14:33:50 -08:00
Harvey Harrison 9c3da09917 IB: Remove __constant_{endian} uses
The base versions handle constant folding just fine, use them
directly.  The replacements are OK in the include/ files as they are
not exported to userspace so we don't need the __ prefixed versions.

This patch does not affect code generation at all.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-01-17 17:11:57 -08:00
Roland Dreier ac8581d408 Merge branches 'ehca', 'ipoib' and 'mlx4' into for-linus 2009-01-16 15:05:54 -08:00
Stephen Rothwell ee96aae573 IB/ehca: Use consistent types for ehca_plpar_hcall9()
ehca_plpar_hcall9() takes an unsigned long array, so make all callers
pass that in.  This fixes warnings introduced by commit fe333321
("powerpc: Change u64/s64 to a long long integer type"), which changed
u64 from unsigned long to unsigned long long.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-01-16 14:55:40 -08:00
Stephen Rothwell 3750f60557 IB/ehca: Fix printk format warnings from u64 type change
Commit fe333321 ("powerpc: Change u64/s64 to a long long integer
type") changed u64 from unsigned long to unsigned long long, which
means that printk formats for printing u64 values should use "ll"
instead of "l" to avoid warnings.  Fix all the places affected by this
in ehca.
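
The same rule applies to user-space printf().  A minimal sketch, where
format_u64() is a hypothetical helper; the cast is needed in user space
because uint64_t may be either unsigned long or unsigned long long,
whereas the kernel's u64 is unsigned long long after the change above:

```c
#include <stdint.h>
#include <stdio.h>

/* Format a 64-bit value with the "ll" length modifier.  Returns the
 * number of characters snprintf() would have written. */
static int format_u64(char *buf, size_t len, uint64_t v)
{
    return snprintf(buf, len, "0x%llx", (unsigned long long)v);
}
```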

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-01-16 14:55:28 -08:00
Roland Dreier 0fd7e1d855 IB/mlx4: Fix memory ordering problem when posting LSO sends
The current work request posting code writes the LSO segment before
writing any data segments.  This leaves a window where the LSO segment
overwrites the stamping in one cacheline that the HCA prefetches
before the rest of the cacheline is filled with the correct data
segments.  When the HCA processes this work request, a local
protection error may result.

Fix this by saving the LSO header size field off and writing it only
after all data segments are written.  This fix is a cleaned-up version
of a patch from Jack Morgenstein <jackm@dev.mellanox.co.il>.

This fixes <https://bugs.openfabrics.org/show_bug.cgi?id=1383>.

Reported-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-01-16 12:47:47 -08:00
Linus Torvalds ccbf04f24c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  IB/iser: Add dependency on INFINIBAND_ADDR_TRANS
  IPoIB: Do not join broadcast group if interface is brought down
  RDMA/nes: Fix for NIPQUAD removal
  IPoIB: Fix loss of connectivity after bonding failover on both sides
  IB/mlx4: Don't register IB device for adapters with no IB ports
  mlx4_core: Fix warning from min()
  IB/ehca: spin_lock_irqsave() takes an unsigned long
2009-01-13 08:19:42 -08:00
Roland Dreier 8c9ea7fe96 Merge branches 'ehca', 'ipoib', 'iser', 'mlx4' and 'nes' into for-next 2009-01-12 19:37:31 -08:00
Harvey Harrison 03080e5cbe RDMA/nes: Fix for NIPQUAD removal
Commit 63779436 ("drivers: replace NIPQUAD()") accidentally replaced
some HIPQUAD()s, causing IP addresses to be printed in reverse order.
Add temporary local vars until the byteswapping can be pushed further
up the stack.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-01-10 21:45:42 -08:00
Roland Dreier 22e7ef9c08 IB/mlx4: Don't register IB device for adapters with no IB ports
If the mlx4_ib driver finds an adapter that has only ethernet ports, the
current code will register an IB device with 0 ports.  Nothing useful or
sensible can be done with such a device, so just skip registering it.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-01-09 13:22:29 -08:00
Coly Li 73ac36ea14 fix similar typos to successfull
While reviewing ocfs2 code, I found 2 typos of "successfull".  After
running grep "successfull " on the kernel tree, I found 22 such typos in
total -- great minds always think alike :)

This patch fixes all the similar typos. Thanks to Randy for the ack and comments.

Signed-off-by: Coly Li <coyli@suse.de>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Acked-by: Roland Dreier <rolandd@cisco.com>
Cc: Jeremy Kerr <jk@ozlabs.org>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Vlad Yasevich <vladislav.yasevich@hp.com>
Cc: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-08 08:31:15 -08:00
Stephen Rothwell 7ddccb234c IB/ehca: spin_lock_irqsave() takes an unsigned long
The flags argument to spin_lock_irqsave() should really be unsigned
long.  This will also help prevent some warnings when we change u64 to
unsigned long long.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-01-07 11:24:36 -08:00
Frederik Schwarzer 025dfdafe7 trivial: fix then -> than typos in comments and documentation
- (better, more, bigger ...) then -> (...) than

Signed-off-by: Frederik Schwarzer <schwarzerf@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2009-01-06 11:28:06 +01:00
Al Viro 56ff5efad9 zero i_uid/i_gid on inode allocation
... and don't bother in callers.  Don't bother with zeroing i_blocks,
while we are at it - it's already been zeroed.

i_mode is not worth the effort; it has no common default value.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-01-05 11:54:28 -05:00
Rusty Russell 2ca1a61583 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6
Conflicts:

	arch/x86/kernel/io_apic.c
2008-12-31 23:05:57 +10:30
Roland Dreier f781a22fa2 IB/mlx4: Fix reading SL field out of cqe->sl_vid
Commit f780a9f1 ("mlx4_core: Add ethernet fields to CQE struct")
introduced a bug in how wc->sl is set in mlx4_ib_poll_one() -- since
cqe->sl_vid is a big-endian value, the shift must be done after
converting to host endianness.
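
The ordering issue can be reproduced without kernel helpers.  The sketch
below is illustrative only: it assumes the SL sits in the top 4 bits of
the 16-bit field, decodes the field from its two wire bytes, and models a
little-endian CPU load by hand so the result does not depend on the host.

```c
#include <stdint.h>

/* Decode a 16-bit big-endian field from its two wire bytes. */
static uint16_t be16_decode(const uint8_t b[2])
{
    return (uint16_t)((b[0] << 8) | b[1]);
}

/* Correct order: convert to host endianness first, then shift.
 * Assumption for illustration: SL is the top 4 bits of the field. */
static unsigned int sl_from_wire(const uint8_t b[2])
{
    return be16_decode(b) >> 12;
}

/* Wrong order, as seen on a little-endian host: the raw storage value
 * is shifted, so the bits come from the wrong byte. */
static unsigned int sl_shift_raw_le(const uint8_t b[2])
{
    uint16_t raw = (uint16_t)(b[0] | (b[1] << 8)); /* LE load of the bytes */

    return raw >> 12;
}
```

For the wire bytes 0xA1 0x23 the field is 0xA123, so the correct SL is
0xA, while shifting the raw little-endian value gives 0x2.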

This bug was found using sparse endianness checking.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-12-30 15:30:26 -08:00
Rusty Russell cbe31f02f5 cpumask: use new cpumask API in drivers/infiniband/hw/ipath
Impact: cleanup

We're moving from handing around cpumask_t's to handing around struct
cpumask *'s.  cpus_*, cpumask_t and cpu_*_map are deprecated: convert
to cpumask_*, cpu_*_mask.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Ralph Campbell <infinipath@qlogic.com>
2008-12-30 09:05:18 +10:30
Rusty Russell b29179c3d3 cpumask: use new cpumask API in drivers/infiniband/hw/ehca
Impact: cleanup

We're moving from handing around cpumask_t's to handing around struct
cpumask *'s.  cpus_*, cpumask_t and cpu_*_map are deprecated: convert
to cpumask_*, cpu_*_mask.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Tested-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Cc: Christoph Raisch <raisch@de.ibm.com>
2008-12-30 09:05:18 +10:30
Rusty Russell 259c4ddd00 cpumask: use for_each_online_cpu() in drivers/infiniband/hw/ehca/ehca_irq.c
Impact: cleanup

In future, accessing cpu numbers beyond nr_cpu_ids (the runtime limit)
will be undefined.  We can avoid future problems by using
for_each_online_cpu() here.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Tested-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Cc: Christoph Raisch <raisch@de.ibm.com>
2008-12-30 09:05:17 +10:30
Linus Torvalds 0191b625ca Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1429 commits)
  net: Allow dependancies of FDDI & Tokenring to be modular.
  igb: Fix build warning when DCA is disabled.
  net: Fix warning fallout from recent NAPI interface changes.
  gro: Fix potential use after free
  sfc: If AN is enabled, always read speed/duplex from the AN advertising bits
  sfc: When disabling the NIC, close the device rather than unregistering it
  sfc: SFT9001: Add cable diagnostics
  sfc: Add support for multiple PHY self-tests
  sfc: Merge top-level functions for self-tests
  sfc: Clean up PHY mode management in loopback self-test
  sfc: Fix unreliable link detection in some loopback modes
  sfc: Generate unique names for per-NIC workqueues
  802.3ad: use standard ethhdr instead of ad_header
  802.3ad: generalize out mac address initializer
  802.3ad: initialize ports LACPDU from const initializer
  802.3ad: remove typedef around ad_system
  802.3ad: turn ports is_individual into a bool
  802.3ad: turn ports is_enabled into a bool
  802.3ad: make ntt bool
  ixgbe: Fix set_ringparam in ixgbe to use the same memory pools.
  ...

Fixed trivial IPv4/6 address printing conflicts in fs/cifs/connect.c due
to the conversion to %pI (in this networking merge) and the addition of
doing IPv6 addresses (from the earlier merge of CIFS).
2008-12-28 12:49:40 -08:00
David S. Miller 2d5451d261 net: Fix warning fallout from recent NAPI interface changes.
When we removed the network device argument from several
NAPI interfaces in 908a7a16b8
("net: Remove unused netdev arg from some NAPI interfaces.")
several drivers now started getting unused variable warnings.

This fixes those up.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-12-26 15:10:00 -08:00
Roland Dreier 2a0d8366dd Merge branches 'cma', 'ehca', 'ipath', 'iser', 'mlx4' and 'nes' into for-next 2008-12-24 20:35:42 -08:00
Jack Morgenstein 7798dbf40a IB/mlx4: Set ownership bit correctly when copying CQEs during CQ resize
When resizing a CQ, when copying over unpolled CQEs from the old CQE
buffer to the new buffer, the ownership bit must be set appropriately
for the new buffer, or the ownership bit in the new buffer gets
corrupted.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-12-24 20:32:42 -08:00
Faisal Latif e189062a8c RDMA/nes: Remove tx_free_list
There is no lock protecting tx_free_list thus causing a system crash
when skb_dequeue() is called and the list is empty.  Since it did not give
any performance boost under heavy load, remove it to simplify the code.
Replace get_free_pkt() with dev_alloc_skb() to allocate MAX_CM_BUFFER skb
for connection establishment/teardown as well as MPA request/response.

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-12-24 20:30:04 -08:00
Neil Horman 908a7a16b8 net: Remove unused netdev arg from some NAPI interfaces.
When the napi api was changed to separate its 1:1 binding to the net_device
struct, the netif_rx_[prep|schedule|complete] api failed to remove the now
vestigial net_device structure parameter.  This patch cleans up that api by
properly removing it.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-12-22 20:43:12 -08:00
Yevgeny Petrilin b8dd786f94 mlx4_core: Add support for multiple completion event vectors
When using MSI-X mode, create a completion event queue for each CPU.
Report the number of completion EQs in a new struct mlx4_caps member,
num_comp_vectors, and extend the mlx4_cq_alloc() interface with a
vector parameter so that consumers can specify which completion EQ
should be used to report events for the CQ being created.

Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-12-22 07:15:03 -08:00
Julia Lawall 139cdab0a2 IB/ehca: Remove redundant test of vpage
vpage is checked not to be NULL just after it is initialized at the
beginning of each loop iteration.

A simplified version of the semantic patch that makes this change is
as follows: (http://www.emn.fr/x-info/coccinelle/)

// <smpl>
@r exists@
local idexpression x;
expression E;
position p1,p2;
@@

if (x@p1 == NULL || ...) { ... when forall
   return ...; }
... when != \(x=E\|x--\|x++\|--x\|++x\|x-=E\|x+=E\|x|=E\|x&=E\|&x\)
(
x@p2 == NULL
|
x@p2 != NULL
)

// another path to the test that is not through p1?
@s exists@
local idexpression r.x;
position r.p1,r.p2;
@@

... when != x@p1
(
x@p2 == NULL
|
x@p2 != NULL
)

@fix depends on !s@
position r.p1,r.p2;
expression x,E;
statement S1,S2;
@@

(
- if ((x@p2 != NULL) || ...)
  S1
|
- if ((x@p2 == NULL) && ...) S1
|
- BUG_ON(x@p2 == NULL);
)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-12-21 13:29:13 -08:00
Stefan Roscher 1c721940dd IB/ehca: Replace modulus operations in flush error completion path
With the latest flush error completion patch we introduced a modulus
operation to calculate the next index within a qmap.  Based on
comments from other mailing lists we decided to optimize this
operation by using an addition and an if-statement instead of modulus,
even though this is on the error path.
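
The replacement can be sketched as follows.  The helper name and
signature are illustrative rather than taken from the patch, and the
sketch assumes cur is always smaller than limit:

```c
/* Advance a ring index without a division: add one, then wrap with a
 * compare.  Equivalent to (cur + 1) % limit when cur < limit. */
static unsigned int next_index(unsigned int cur, unsigned int limit)
{
    unsigned int next = cur + 1;

    if (next == limit)
        next = 0;
    return next;
}
```

Unlike masking with (limit - 1), this form does not require limit to be
a power of two.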

Signed-off-by: Stefan Roscher <stefan.roscher@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-12-05 11:25:38 -08:00
Dave Olson 3d0890985a IB/ipath: Add locking for interrupt use of ipath_pd contexts vs free
Fixes timing race resulting in panic.  Not a performance sensitive path.

Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-12-05 11:14:38 -08:00
Dave Olson 1bf7724e09 IB/ipath: Fix spi_pioindex value
ipath_piobufbase was a single value offset, but is multiple values on
newer chips, so use only the 32 bits for the 2K buffers (4K buffers
are currently used only by the driver).

Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-12-05 11:13:19 -08:00
Dave Olson 6114d4cd31 IB/ipath: Only do 1X workaround on rev1 chips
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-12-05 11:13:19 -08:00
Dave Olson 60e845035a IB/ipath: Don't count IB symbol and link errors unless link is UP
Implement the ignoring of ibsymbol errors and linkrecover errors while
the link is at less than INIT (long needed), to get accurate counts.
Particularly an issue when doing non-IBTA DDR negotiation with chips
from vendors that do not support IBTA mode negotiation.  If the driver
is unloaded, and there is a delta, the adjusted counters are written
back to the chip, so they stay adjusted across driver reload.

Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-12-05 11:13:19 -08:00
Ralph Campbell 890fccb242 IB/ipath: Check return value of dma_map_single()
This fixes an obvious oversight where the return value is not checked
for error.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-12-05 11:13:18 -08:00
Ralph Campbell fab01fc560 IB/ipath: Fix PSN of send WQEs after an RDMA read resend
The PSN of the first packet after an RDMA read is based on the size of
the RDMA read request. This is calculated correctly for the WQE sent
after the first request message but not on subsequent requests if the
RDMA read is resent.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-12-05 11:13:18 -08:00
Chien Tung 6098d10749 RDMA/nes: Cleanup warnings
Wrap NES_DEBUG and assert macros with do while (0) to avoid ambiguous
else.  No one is using sk_buff * returned from form_cm_frame(), so
drop the return.  drop_packet() should not be incrementing reset
counter on receiving a FIN.
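
A minimal sketch of the do while (0) idiom mentioned above, assuming a
debug macro whose body contains an if.  DEMO_DEBUG, debug_enabled and
classify() are made-up names, not the nes driver's definitions:

```c
#include <stdio.h>

static int debug_enabled;	/* hypothetical switch, off by default */

/*
 * Wrapped in do { } while (0), the macro expands to exactly one
 * statement, so an "else" written after it cannot attach to the
 * "if" hidden inside the macro body.
 */
#define DEMO_DEBUG(...)					\
	do {						\
		if (debug_enabled)			\
			fprintf(stderr, __VA_ARGS__);	\
	} while (0)

static int classify(int err)
{
	if (err)
		DEMO_DEBUG("error %d\n", err);
	else
		return 0;	/* pairs with "if (err)", as intended */
	return 1;
}
```

If the macro body were the bare if statement, the else in classify()
would bind to the macro's if instead, and classify(0) would fall
through and return 1.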

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-12-05 11:00:41 -08:00
Chien Tung 1ee86555b2 RDMA/nes: Add loopback check to make_cm_node()
Check for loopback connection in make_cm_node().

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-12-05 11:00:29 -08:00
Faisal Latif f3181a10e1 RDMA/nes: Check cqp_avail_reqs is empty after locking the list
Between the first empty list check and locking the list, the list can
change.  Check it again after it is locked to make sure the list is
still not empty.
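
The check, lock, check again pattern described above could look roughly
like this in userspace C.  The structures, the toy spinlock built on
atomic_flag and all demo_* / pool_* names are assumptions made for the
sketch, not the driver's code:

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's request list and spinlock. */
struct demo_req {
	struct demo_req *next;
};

struct demo_pool {
	atomic_flag lock;		/* toy spinlock */
	struct demo_req *avail;		/* singly linked list of free requests */
};

static void pool_lock(struct demo_pool *p)
{
	while (atomic_flag_test_and_set_explicit(&p->lock, memory_order_acquire))
		;	/* spin until the holder clears the flag */
}

static void pool_unlock(struct demo_pool *p)
{
	atomic_flag_clear_explicit(&p->lock, memory_order_release);
}

/* Take one request from the pool, or return NULL if none is left. */
static struct demo_req *demo_get_req(struct demo_pool *p)
{
	struct demo_req *req;

	/* Unlocked peek: cheap, but only a hint. */
	if (p->avail == NULL)
		return NULL;

	pool_lock(p);
	/* Check again under the lock: the list may have been emptied. */
	req = p->avail;
	if (req != NULL)
		p->avail = req->next;
	pool_unlock(p);

	return req;
}
```

Only the second test, made while the lock is held, decides whether an
entry is actually removed.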

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-12-05 11:00:24 -08:00
Faisal Latif abb7725676 RDMA/nes: Fix TCP compliance test failures
ANVL testing showed we are not handling all cm_node states during
connection establishment.  Add missing state handlers and fix sequence
number send reset in handle_tcp_options().

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-12-05 11:00:19 -08:00
Faisal Latif 4a14f6a79f RDMA/nes: Forward packets for a new connection with stale APBVT entry
Under heavy traffic, there is a small window when an APBVT entry is
not yet removed and a new connection is established.  Packets for the
new connection are dropped until the APBVT entry is removed.  This patch
forwards the packets instead of dropping them.

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-12-05 11:00:13 -08:00
Faisal Latif 183ecfa309 RDMA/nes: Avoid race between MPA request and reset event to rdma_cm
In passive open, after indicating MPA request to rdma_cm, an incoming
RST would fire a reset event to rdma_cm causing it to crash, since the
current state is not connected.  The solution is to wait for
nes_accept() or nes_reject() before firing the reset event.  If
nes_accept() or nes_reject() is already done, then the reset event
will be fired when RST is processed.

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-12-05 11:00:08 -08:00
Faisal Latif 879e5bd5a1 RDMA/nes: Lock down connected_nodes list while processing it
While processing connected_nodes list, we would release the lock when
we need to send reset to remote partner.  That created a window where
the list can be modified.  Change this into a two step process: place
nodes that need processing on a local list then process the local list.
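
A sketch of the two step approach described above, under stated
assumptions: the node layout, the second link field local_next, the toy
spinlock and every demo_* / core_* name are invented for illustration
and do not come from the nes driver.

```c
#include <stdatomic.h>
#include <stddef.h>

struct demo_node {
	struct demo_node *next;		/* link on the shared list */
	struct demo_node *local_next;	/* link on the private work list */
	int needs_reset;
	int reset_sent;
};

struct demo_core {
	atomic_flag lock;		/* toy spinlock guarding "connected" */
	struct demo_node *connected;
};

static void core_lock(struct demo_core *c)
{
	while (atomic_flag_test_and_set_explicit(&c->lock, memory_order_acquire))
		;	/* spin */
}

static void core_unlock(struct demo_core *c)
{
	atomic_flag_clear_explicit(&c->lock, memory_order_release);
}

/* Returns the number of nodes that were processed. */
static int demo_process(struct demo_core *core)
{
	struct demo_node *local = NULL, *n;
	int sent = 0;

	/* Step 1: with the lock held, only collect the nodes needing work. */
	core_lock(core);
	for (n = core->connected; n != NULL; n = n->next) {
		if (n->needs_reset) {
			n->local_next = local;
			local = n;
		}
	}
	core_unlock(core);

	/* Step 2: lock dropped; only the private list is walked here. */
	for (n = local; n != NULL; n = n->local_next) {
		n->reset_sent = 1;	/* stands in for sending a reset */
		sent++;
	}
	return sent;
}
```

The shared list is never traversed with the lock released.  A real
implementation would also have to keep the collected nodes alive, for
example by taking a reference on each, which this sketch leaves out.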

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-12-05 11:00:02 -08:00
Faisal Latif c5d321e5c9 RDMA/nes: Cleanup cqp_request list usage
Use nes_free_cqp_request() instead of open coding.  Change some
continue statements to break in nes_cm_timer_tick(), because send_entry
used to be a list processed in a loop (so continue went to the next
item).  Now it is a single item, so using break is correct.

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-12-05 10:59:53 -08:00
David S. Miller aa2ba5f108 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	drivers/net/ixgbe/ixgbe_main.c
	drivers/net/smc91x.c
2008-12-02 19:50:27 -08:00
Ralph Campbell 7c37d74474 IB/ipath: Improve UD loopback performance by allocating temp array only once
Receive work queue entries are checked for L_Key validity, and
pointers to the memory region structure are saved in an allocated
structure.  For UD loopback packets, this structure is allocated and
freed for each packet.  This patch changes that to allocate/free
during QP creation and destruction.
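
The idea can be sketched in plain C as follows.  This is an assumption
laden illustration: struct demo_qp, r_sg_list and the demo_* functions
are hypothetical, and calloc()/free() stand in for the kernel
allocator.

```c
#include <stdlib.h>

struct demo_sge {
	void *mr;			/* saved memory region pointer */
};

struct demo_qp {
	unsigned int max_sge;
	struct demo_sge *r_sg_list;	/* scratch array, allocated once */
};

/* QP creation: allocate the scratch array a single time. */
static int demo_qp_create(struct demo_qp *qp, unsigned int max_sge)
{
	qp->max_sge = max_sge;
	qp->r_sg_list = calloc(max_sge, sizeof(*qp->r_sg_list));
	return qp->r_sg_list ? 0 : -1;
}

/* QP destruction: the only place the array is freed. */
static void demo_qp_destroy(struct demo_qp *qp)
{
	free(qp->r_sg_list);
	qp->r_sg_list = NULL;
}

/* Per-packet path: reuses the array, no allocation and no free. */
static struct demo_sge *demo_ud_loopback(struct demo_qp *qp,
					 unsigned int num_sge)
{
	unsigned int i;

	if (num_sge > qp->max_sge)
		return NULL;
	for (i = 0; i < num_sge; i++)
		qp->r_sg_list[i].mr = NULL;	/* placeholder for L_Key lookup */
	return qp->r_sg_list;
}
```

Every loopback packet then works on the same buffer, so the per-packet
cost of an allocation and a free disappears.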

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-12-01 20:59:08 -08:00
Michael Ellerman 64f22fa17c IB/ipath: Fix pointer-to-pointer thinko in ipath_fs.c
The return from lookup_one_len() is assigned to *dentry, so that's
what we should be checking with IS_ERR().
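
To make the distinction concrete, here is a userspace sketch.
demo_err_ptr() and demo_is_err() only imitate the kernel's ERR_PTR()
and IS_ERR() conventions, and demo_lookup() and demo_create() are
hypothetical stand-ins for the lookup and its caller:

```c
#include <stdint.h>
#include <stddef.h>

#define DEMO_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static void *demo_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* True if the pointer value lies in the top DEMO_MAX_ERRNO addresses. */
static int demo_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-DEMO_MAX_ERRNO;
}

struct demo_dentry {
	int unused;
};

/* Stand-in for a lookup that reports failure through an error pointer. */
static struct demo_dentry *demo_lookup(int fail)
{
	static struct demo_dentry d;

	return fail ? demo_err_ptr(-2) : &d;
}

static int demo_create(struct demo_dentry **dentry, int fail)
{
	*dentry = demo_lookup(fail);

	/*
	 * Test the looked-up value (*dentry).  The out-parameter "dentry"
	 * itself is an ordinary valid address and never looks like an error.
	 */
	if (demo_is_err(*dentry))
		return (int)(intptr_t)*dentry;
	return 0;
}
```

Checking dentry rather than *dentry would compile without complaint
but could never detect the failure, which is the thinko the commit
fixes.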

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-12-01 20:59:07 -08:00