Commit Graph

1361 Commits (02eb41ef2a8631022fd90e096c57562dec9e7a9a)

Author SHA1 Message Date
Roel Kluin af04662b4d IB/ehca: Ensure that guid_entry index is not negative
This prevents the memcpy() of a guid_entries element using a negative index.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-06-22 22:23:48 -07:00
Hannes Hering 0cf89dcdbc IB/ehca: Tolerate dynamic memory operations before driver load
Implement toleration of dynamic memory operations and 16 GB gigantic
pages, where "toleration" means that the driver can cope with dynamic
memory operations that happen before the driver is loaded.  While the
ehca driver is loaded, dynamic memory operations are still prohibited
by returning NOTIFY_BAD from the memory notifier.

On module load the driver walks through available system memory,
checks for available memory ranges and then registers the kernel
internal memory region accordingly.  The translation of address ranges
is implemented via a 3-level busmap.

Signed-off-by: Hannes Hering <hering2@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-06-22 22:18:51 -07:00
Greg Kroah-Hartman f899c2ddd4 infiniband: ehca: remove driver_data direct access of struct device
In the near future, the driver core is going to not allow direct access
to the driver_data pointer in struct device.  Instead, the functions
dev_get_drvdata() and dev_set_drvdata() should be used.  These functions
have been around since the beginning, so are backwards compatible with
all older kernel versions.

Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Roland Dreier <rolandd@cisco.com>
Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
Cc: general@lists.openfabrics.org
Cc: Christoph Raisch <raisch@de.ibm.com>
Acked-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-06-15 21:30:27 -07:00
Roland Dreier 8d34ff3401 Merge branches 'cxgb3', 'ehca', 'misc', 'mlx4', 'mthca' and 'nes' into for-linus 2009-06-14 13:31:19 -07:00
Roland Dreier 9aa0a489d9 IB/mthca: Don't double-free IRQs when falling back from MSI-X to INTx
When both MSI-X and legacy INTx fail to generate an interrupt, the
driver frees the MSI-X interrupts twice.  Fix this by clearing the
have_irq flag for the MSI-X interrupts when they are freed the first
time.

Reported-by: Yinghai Lu <yhlu.kernel@gmail.com>
Tested-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-06-13 15:14:09 -07:00
Jack Morgenstein 2ac6bf4ddc IB/mlx4: Add strong ordering to local inval and fast reg work requests
The ConnectX Programmer's Reference Manual states that the "SO" bit
must be set when posting Fast Register and Local Invalidate send work
requests.  When this bit is set, the work request will be executed
only after all previous work requests on the send queue have been
executed.  (If the bit is not set, Fast Register and Local Invalidate
WQEs may begin execution too early, which violates the defined
semantics for these operations)

This fixes the issue with NFS/RDMA reported in
<http://lists.openfabrics.org/pipermail/general/2009-April/059253.html>

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Cc: <stable@kernel.org>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-06-05 10:36:24 -07:00
Joachim Fenkes 25a5239327 IB/ehca: Remove superfluous bitmasks from QP control block
All the fields in the control block are nicely right-aligned, so no
masking is necessary.

Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-06-03 13:25:42 -07:00
Steve Wise 3026c19a14 RDMA/cxgb3: Limit fast register size based on T3 limitations
T3 firmware only supports one WRs worth of page list for fast register
work requests.  The driver currently allows 2 WRs worth, which
doesn't work for T3, so reduce the limit in the driver.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-05-27 14:43:39 -07:00
Steve Wise 7ab1a2b31d RDMA/cxgb3: Report correct port state and MTU
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-05-27 14:42:36 -07:00
Eli Cohen c1f67a88bf IB/mthca: Add module parameter for number of MTTs per segment
The current MTT allocator uses kmalloc() to allocate a buffer for its
buddy allocator, and thus is limited in the amount of MTT segments
that it can control.  As a result, the size of memory that can be
registered is limited too.  This patch uses a module parameter to
control the number of MTT entries that each segment represents,
allowing more memory to be registered with the same number of
segments.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-05-27 14:36:16 -07:00
Roel Kluin 28e43a519b RDMA/nes: Fix off-by-one bugs in reset_adapter_ne020() and init_serdes()
With a postfix increment, i is incremented one past 10K/5K before the
loop ends, so the error messages will be displayed too soon if the
test succeeds on the last iteration.  Fix the comparisons to be >
instead of >=.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-05-15 10:16:45 -07:00
Jack Stone 5b891a9332 infiniband: Remove void casts
Remove uneeded casts of void *.

Signed-off-by: Jack Stone <jwjstone@fastmail.fm>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-05-13 16:53:39 -07:00
Stefan Roscher bde2cfaf8f IB/ehca: Increment version number
Signed-off-by: Stefan Roscher <stefan.roscher@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-05-13 16:52:43 -07:00
Stefan Roscher 1988d1fa1a IB/ehca: Remove unnecessary memory operations for userspace queue pairs
The queue map for flush completion circumvention is only used for
kernel space queue pairs.  This patch skips the allocation of the
queue maps in case the QP is created for userspace.  In addition, this
patch does not iomap the galpas for kernel usage if the queue pair is
only used in userspace.  These changes will improve the performance of
creation of userspace queue pairs.

Signed-off-by: Stefan Roscher <stefan.roscher@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-05-13 16:52:43 -07:00
Stefan Roscher c94f156f63 IB/ehca: Fall back to vmalloc() for big allocations
In case of large queue pairs there is the possibillity of allocation
failures due to memory fragmentation when using kmalloc().  To ensure
the memory is allocated even if kmalloc() can not find chunks which
are big enough, we fall back to allocating the memory with vmalloc().

Signed-off-by: Stefan Roscher <stefan.roscher@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-05-13 16:52:42 -07:00
Anton Blanchard bf31a1a02e IB/ehca: Replace vmalloc() with kmalloc() for queue allocation
To improve performance of driver resource allocation, replace
vmalloc() calls with kmalloc().

Signed-off-by: Stefan Roscher <stefan.roscher@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-05-13 16:52:40 -07:00
Linus Torvalds c98861f7de Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  IB/mlx4: Don't overwrite fast registration page list when posting work request
  RDMA/cxgb3: Don't complete flushed send work requests twice
2009-05-13 16:31:12 -07:00
Roland Dreier 8be741b0ac Merge branches 'cxgb3' and 'mlx4' into for-linus 2009-05-13 15:16:17 -07:00
Al Viro 265e771e81 Fix deadlock in ipathfs ->get_sb()
forgot to unlock superblock before calling deactivate_super()...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-05-09 10:49:40 -04:00
Jack Morgenstein 2b6b7d4be4 IB/mlx4: Don't overwrite fast registration page list when posting work request
The low-level mlx4 driver modified the page-list addresses for fast
register work requests post send to big-endian, and set a "present"
bit.  This caused problems later when the consumer attempted to unmap
the pages using the page-list (using the list addresses which were
assumed to be still in CPU-endian order).  Fix the mlx4 driver to
allocate two buffers and use a private buffer for the hardware-format
bus addresses.

This patch fixes <https://bugs.openfabrics.org/show_bug.cgi?id=1571>,
an NFS/RDMA server crash.  The cause of the crash was found by Vu Pham
of Mellanox.  The fix is along the lines suggested by Steve Wise in
comment #21 in bug 1571.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-05-07 21:35:13 -07:00
Steve Wise ec6995ddaa RDMA/cxgb3: Don't complete flushed send work requests twice
When the SQ is flushed, mark the flushed entries as not signaled so
the poll logic doesn't re-insert the CQ entry thinking its an out of
order completion.

The bug can cause the NFS/RDMA server to crash due to processing the
same completed work request twice.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-29 15:15:59 -07:00
Roland Dreier 9308f96c79 Merge branches 'cxgb3', 'ipoib', 'mthca', 'mlx4' and 'nes' into for-linus 2009-04-28 16:01:31 -07:00
Chien Tung 26cc5e57bb RDMA/nes: Update iw_nes version
Update version number to 1.5.0.0

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-27 13:46:29 -07:00
Faisal Latif 9256b25130 RDMA/nes: Fix error path in nes_accept()
If reg_phys_mem() fails, we need to free memory allocated for MPA
frame with private data before returning the error. Also move
nes_add_ref() after the reg_phys_mem() is successful.

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-27 13:45:19 -07:00
Faisal Latif 109d67e4f1 RDMA/nes: Fix hang issues for large cluster dynamic connections
Running large cluster setup, we are hanging after many hours of
testing.  Fixing this required going over the code and making sure the
rexmit entry was properly removed based on the cm_node's state and
packet received.  Also when receiving a FIN packet, check seq# and
make sure there were no errors before calling handle_fin().

Following are the changes done in nes_cm.c:

* handle_ack_pkt() needs to return error value, so in case of error,
  handle_fin() is not called. Some cleanup done while going over the code.

* handle_rst_pkt(), handling of cm_node's NES_CM_STATE_LAST_ACK is missing.

* process_packet(), in case of FIN only packet is received, call
  check_seq() before processing.

* in handle_fin_pkt(), we are calling cleanup_retrans_entry() for all
  conditions, even if the packets need to be dropped.

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-27 13:41:06 -07:00
Faisal Latif 4e9c390036 RDMA/nes: Increase rexmit timeout interval
Under heavy load with large cluster testing, it may take longer to
receive a response to MPA requests.  Change the driver to wait longer
after each rexmit to max time value.

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-27 13:39:36 -07:00
Faisal Latif c11470f9f4 RDMA/nes: Check for sequence number wrap-around
check_seq() was not checking if the seq#s have wrapped.  Fix it.

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-27 13:38:31 -07:00
Faisal Latif 53094c388f RDMA/nes: Do not set apbvt entry for loopback
When a connect request comes, apbvt should only be set for
non-loopback connections.

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-27 13:37:34 -07:00
Chien Tung 1f0dba1e51 RDMA/nes: Fix unused variable compile warning when INFINIBAND_NES_DEBUG=n
Remove the NES_DEBUG that is causing the compile warning about an
unused variable when INFINIBAND_NES_DEBUG is not enabled.

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-27 13:36:03 -07:00
Chien Tung 0e4562da9e RDMA/nes: Fix fw_ver in /sys
/sys/class/infiniband/nes?/fw_ver is not displaying firmware version
properly (it shows 0.0.0 with the current code).  Fill in the correct
firmware version number.

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-27 13:33:48 -07:00
Chien Tung 923223776b RDMA/nes: Set trace length to 1 inch for SFP_D
With updated PHY firmware for SFP_D, setting the trace length to 1
inch for SFP_D provides a more stable link.

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-27 13:30:35 -07:00
Chien Tung e998c25bc2 RDMA/nes: Enable repause timer for port 1
Enable repause timer for port 1.  Without this setting, under stress,
the chip may misbehave.

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-27 13:29:42 -07:00
Chien Tung 366835e249 RDMA/nes: Correct CDR loop filter setting for port 1
In commit 1b949324 ("RDMA/nes: Fix SFP+ PHY initialization") there is
a mistake in the clean up code that removed port 1 CDR loop filter
settings for 10G cards other than CX4.  Put the correct setting back
for appropriate PHY types.

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-27 13:28:41 -07:00
Chien Tung 010db4d127 RDMA/nes: Modify thermo mitigation to flip SerDes1 ref clk to internal
Change thermo mitigation code to flip the SerDes1 reference clock to
internal, to match the change in commit a4849fc1 ("RDMA/nes: Add
wide_ppm_offset parm for switch compatibility").

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-27 13:27:21 -07:00
Miroslaw Walukiewicz 5d1af5c832 RDMA/nes: Fix resource issues in nes_create_cq() and nes_destroy_cq()
In error paths where a CQ is not created, pbl is not freeed properly.

In nes_destroy_cq(), add the corresponding check for nescq->mcrqf to
not call nes_free_resource() when it is already done in nes_create_cq().

Signed-off-by: Miroslaw Walukiewicz <miroslaw.walukiewicz@intel.com>
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-21 16:16:48 -07:00
Matt Kraai cc005fa20c RDMA/nes: Remove root_256()'s unused pbl_count_256 parameter
Signed-off-by: Matt Kraai <kraai@ftbfs.org>
Acked-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-21 10:43:21 -07:00
Jack Morgenstein 8531f1f14a IB/mthca: Fix timeout for INIT_HCA and a few other commands
Commands INIT_HCA, CLOSE_HCA, SYS_EN, SYS_DIS, and CLOSE_IB all have 1
second timeouts.  For INIT_HCA this causes problems when had more than
2^18 are QPs configured, since the command takes more than 1 second to
complete.

All other commands have 60-second timeouts.  This patch makes the
above commands consistent with the rest of the commands (and with the
chip documentation).

This patch is an expansion of a patch from Arthur Kepner
<akepner@sgi.com> fixing just the INIT_HCA timeout.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-20 21:12:25 -07:00
Steve Wise cde9e2f930 RDMA/cxgb3: Don't zero QP attrs when moving to IDLE
QP attributes must stay initialized when moving back to IDLE.  Zeroing
them will crash the system in _flush_qp() if the QP is subsequently
moved to ERROR and back to IDLE.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-20 17:00:53 -07:00
Don Wood 3f32eb1185 RDMA/nes: Fix bugs in nes_reg_phys_mr()
The code incorrectly failed memory registration if the buffer was not
page aligned.  Also, the length field is mangled causing the hardware
to think the registration is much larger than it really is.

The fix is to remove the page alignment restriction as well the
incorrect length adjustment.  Also make sure that all buffers after
the first start at a page boundary, and all buffers except the last
end on a page boundary.

Signed-off-by: Don Wood <donald.e.wood@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-20 14:53:00 -07:00
Chien Tung 1af9222b52 RDMA/nes: Fix compiler warning at nes_verbs.c:1955
Initialize pbl_count_256 to 0 to get rid of the warning:

    drivers/infiniband/hw/nes/nes_verbs.c: In function 'nes_reg_mr':
    drivers/infiniband/hw/nes/nes_verbs.c:1955: warning: 'pbl_count_256' may be used uninitialized in this function

Reported-by: Roland Dreier <rdreier@cisco.com>
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-20 14:50:36 -07:00
Steve Wise 96ac7e8892 RDMA/cxgb3: Adjust ORD/IRD (if needed) for peer2peer connections
NFS/RDMA currently fails to set up connections if peer2peer is on.
This is due to the fact that the NFS/RDMA client sets its ORD to 0.

If peer2peer is set, make sure the active side ORD is >= 1 and the
passive side IRD is >=1.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-20 13:53:15 -07:00
Linus Torvalds 0534c8cb5c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  RDMA/nes: Add support for new SFP+ PHY
  RDMA/nes: Add wide_ppm_offset parm for switch compatibility
  RDMA/nes: Fix SFP+ PHY initialization
  RDMA/nes: Fix nes_nic_cm_xmit() error handling
  RDMA/nes: Fix error handling issues
  RDMA/nes: Fix incorrect casts on 32-bit architectures
  IPoIB: Document newish features
  RDMA/cma: Create cm id even when IB port is down
  RDMA/cma: Use rate from IPoIB broadcast when joining IPoIB multicast groups
  IPoIB: Avoid free_netdev() BUG when destroying a child interface
  mlx4_core: Don't leak mailbox for SET_PORT on Ethernet ports
  RDMA/cxgb3: Release dependent resources only when endpoint memory is freed.
  RDMA/cxgb3: Handle EEH events
  IB/mlx4: Use pgprot_writecombine() for BlueFlame pages
2009-04-09 16:42:26 -07:00
Roland Dreier 07306c0b98 Merge branches 'cma', 'cxgb3', 'ipoib', 'mlx4' and 'nes' into for-next 2009-04-08 14:28:21 -07:00
Chien Tung 4303565df4 RDMA/nes: Add support for new SFP+ PHY
Add new register settings for new SFP+ PHY/firmware.
Add new PHY to to nes_netdev_get/set_settings.

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-08 14:27:56 -07:00
Chien Tung a4849fc157 RDMA/nes: Add wide_ppm_offset parm for switch compatibility
We have observed unstable link with a new BNT switch.

Add wide_ppm_offset parameter to allow the user to control the clock
ppm offset on the CX4 interface for better compatibility.  Default is
100ppm, setting it to 1 will increase it to 300ppm.  Change default
SerDes1 reference clock to external source.

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-08 14:27:18 -07:00
Chien Tung 1b9493248c RDMA/nes: Fix SFP+ PHY initialization
SFP+ PHY initialization has very long delays, incorrect settings for
direct attach copper cables, and inconsistent link detection.

Adjust delays to the minimum required by the PHY.  Worst case is now
less than 4 seconds.  Add new register settings for direct attach
cables.  Change link detection logic to use two new registers for more
consistent link state detection.  Reorganize code to shorten line
length.

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-08 14:27:09 -07:00
Faisal Latif 5962c2c803 RDMA/nes: Fix nes_nic_cm_xmit() error handling
We are getting crash or hung situation when we are running network
cable pull tests during RDMA traffic.

In schedule_nes_timer(), we return an error if nes_nic_cm_xmit()
returns failure.  This is changed to success as skb is being put on
the timer routines to be processed later.  In send_syn() case, we are
indicating connect failure once from nes_connect() and the other when
the rexmit retries expires.

The other issue is skb->users which we are incrementing before calling
nes_nic_cm_xmit() which calls dev_queue_xmit() but in case of failure
we are decrementing the skb->users at the same time putting the skb on
the rexmit path.  Even if dev_queue_xmit() fails, the skb->users is
decremented already.  We are removing the decrement of skb->users in
case of failure from both schedule_nes_timer() as well as from
nes_cm_timer_tick().

There is also extra check in nes_cm_timer_tick() for rexmit failure
which does a break from the loop is removed.  This causes problem as
the other nodes have their cm_node->ref_count incremented and are not
processed.

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-08 14:23:55 -07:00
Faisal Latif 79fc3d7410 RDMA/nes: Fix error handling issues
Fix issues found by static code analysis:

(1) Check if cm_node was successfully created for loopback connection.

(2) schedule_nes_timer() does not free up allocated memory after
    encountering an error.  There is a WARN_ON() for this condition.

(3) there is a cm_node->freed flag which is set but not used.

Reported-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-08 14:22:20 -07:00
Don Wood 7a5efb62f6 RDMA/nes: Fix incorrect casts on 32-bit architectures
The were some incorrect casts to unsigned long that caused 64-bit values
to be truncated on 32-bit architectures and made the driver pass invalid
adresses and lengths to the hardware.  The problems were primarily seen
with kernels with highmem configured but some could show up in
non-highmem kernels, too.

Signed-off-by: Don Wood <donald.e.wood@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-08 14:21:02 -07:00
Yang Hongyang 284901a90a dma-mapping: replace all DMA_32BIT_MASK macro with DMA_BIT_MASK(32)
Replace all DMA_32BIT_MASK macro with DMA_BIT_MASK(32)

Signed-off-by: Yang Hongyang<yanghy@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-07 08:31:11 -07:00
Yang Hongyang 6a35528a83 dma-mapping: replace all DMA_64BIT_MASK macro with DMA_BIT_MASK(64)
Replace all DMA_64BIT_MASK macro with DMA_BIT_MASK(64)

Signed-off-by: Yang Hongyang<yanghy@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-07 08:31:10 -07:00
Steve Wise 874d8df5ed RDMA/cxgb3: Release dependent resources only when endpoint memory is freed.
The cxgb3 l2t entry, hwtid, and dst entry were being released before
all the iwch_ep references were released.  This can cause a crash in
t3_l2t_send_slow() and other places where the l2t entry is used.

The fix is to defer releasing these resources until all endpoint
references are gone.

Details:

- move flags field to the iwch_ep_common struct.
- add a flag indicating resources are to be released.
- release resources at endpoint free time instead of close/abort time.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-03-30 08:37:59 -07:00
Steve Wise 04b5d028f5 RDMA/cxgb3: Handle EEH events
- wrap calls into cxgb3 and fail them if we're in the middle
  of a PCI EEH event.

- correctly unwind and release endpoint and other resources when
  we are in an EEH event.

- dispatch IB_EVENT_DEVICE_FATAL event when cxgb3 notifies iw_cxgb3 of
  a fatal error.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-03-30 08:37:56 -07:00
Roland Dreier e1d60ec669 IB/mlx4: Use pgprot_writecombine() for BlueFlame pages
The PAT work on x86 has finally made pgprot_writecombine() a usable API
for modular drivers.  As the comment indicates, this is exactly what we
want to use in mlx4_ib to map BlueFlame pages up to userspace, since
using WC for these pages improves small message latency significantly.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-03-30 08:31:05 -07:00
Roland Dreier 7c757eb9f8 RDMA/nes: Fix mis-merge
When net-next and infiniband were merged upstream, each branch deleted
one of a pair of adjacent lines from nes_nic.c, but when Linus fixed the
conflict up, he brought back both of the lines.  Fix up to the intended
final tree state.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-26 17:00:25 -07:00
Linus Torvalds 6671de344c Merge branch 'timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (26 commits)
  posix timers: fix RLIMIT_CPU && fork()
  time: ntp: fix bug in ntp_update_offset() & do_adjtimex(), fix
  time: ntp: clean up second_overflow()
  time: ntp: simplify ntp_tick_adj calculations
  time: ntp: make 64-bit constants more robust
  time: ntp: refactor do_adjtimex() some more
  time: ntp: refactor do_adjtimex()
  time: ntp: fix bug in ntp_update_offset() & do_adjtimex()
  time: ntp: micro-optimize ntp_update_offset()
  time: ntp: simplify ntp_update_offset_fll()
  time: ntp: refactor and clean up ntp_update_offset()
  time: ntp: refactor up ntp_update_frequency()
  time: ntp: clean up ntp_update_frequency()
  time: ntp: simplify the MAX_TICKADJ_SCALED definition
  time: ntp: simplify the second_overflow() code flow
  time: ntp: clean up kernel/time/ntp.c
  x86: hpet: stop HPET_COUNTER when programming periodic mode
  x86: hpet: provide separate functions to stop and start the counter
  x86: hpet: print HPET registers during setup (if hpet=verbose is used)
  time: apply NTP frequency/tick changes immediately
  ...
2009-03-26 16:05:42 -07:00
Linus Torvalds 13220a94d3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1750 commits)
  ixgbe: Allow Priority Flow Control settings to survive a device reset
  net: core: remove unneeded include in net/core/utils.c.
  e1000e: update version number
  e1000e: fix close interrupt race
  e1000e: fix loss of multicast packets
  e1000e: commonize tx cleanup routine to match e1000 & igb
  netfilter: fix nf_logger name in ebt_ulog.
  netfilter: fix warning in ebt_ulog init function.
  netfilter: fix warning about invalid const usage
  e1000: fix close race with interrupt
  e1000: cleanup clean_tx_irq routine so that it completely cleans ring
  e1000: fix tx hang detect logic and address dma mapping issues
  bridge: bad error handling when adding invalid ether address
  bonding: select current active slave when enslaving device for mode tlb and alb
  gianfar: reallocate skb when headroom is not enough for fcb
  Bump release date to 25Mar2009 and version to 0.22
  r6040: Fix second PHY address
  qeth: fix wait_event_timeout handling
  qeth: check for completion of a running recovery
  qeth: unregister MAC addresses during recovery.
  ...

Manually fixed up conflicts in:
	drivers/infiniband/hw/cxgb3/cxio_hal.h
	drivers/infiniband/hw/nes/nes_nic.c
2009-03-26 15:54:36 -07:00
Linus Torvalds 39b566eedb Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (30 commits)
  RDMA/cxgb3: Enforce required firmware
  IB/mlx4: Unregister IB device prior to CLOSE PORT command
  mlx4_core: Add link type autosensing
  mlx4_core: Don't perform SET_PORT command for Ethernet ports
  RDMA/nes: Handle MPA Reject message properly
  RDMA/nes: Improve use of PBLs
  RDMA/nes: Remove LLTX
  RDMA/nes: Inform hardware that asynchronous event has been handled
  RDMA/nes: Fix tmp_addr compilation warning
  RDMA/nes: Report correct vendor_id and vendor_part_id
  RDMA/nes: Update copyright to new legal entity and year
  RDMA/nes: Account for freed PBL after HW operation
  IB: Remove useless ibdev_is_alive() tests from sysfs code
  IB/sa_query: Fix AH leak due to update_sm_ah() race
  IB/mad: Fix ib_post_send_mad() returning 0 with no generate send comp
  IB/mad: initialize mad_agent_priv before putting on lists
  IB/mad: Fix null pointer dereference in local_completions()
  IB/mad: Fix RMPP header RRespTime manipulation
  IB/iser: Remove hard setting of path MTU
  mlx4_core: Add device IDs for MT25458 10GigE devices
  ...
2009-03-26 15:47:08 -07:00
David S. Miller 08abe18af1 Merge branch 'master' of /home/davem/src/GIT/linux-2.6/
Conflicts:
	drivers/net/wimax/i2400m/usb-notif.c
2009-03-26 15:23:24 -07:00
Ingo Molnar 7c526e1fef Merge branches 'timers/new-apis', 'timers/ntp' and 'timers/urgent' into timers/core 2009-03-26 15:45:52 +01:00
Roland Dreier 09f98bafea Merge branches 'cxgb3', 'endian', 'ipath', 'ipoib', 'iser', 'mad', 'misc', 'mlx4', 'mthca', 'nes' and 'sysfs' into for-next 2009-03-24 20:44:41 -07:00
Steve Wise d1fbe04eee RDMA/cxgb3: Enforce required firmware
The cxgb3 NIC driver can handle more firmware versions than iw_cxgb3,
and since commit 8207befa ("cxgb3: untie strict FW matching") cxgb3
will load with firmware versions that iw_cxgb3 can't handle.  The FW
major number indicates a specific interface between the FW and
iw_cxgb3.  Thus if the major number of the running firmware does not
match the required version compiled into iw_cxgb3, then iw_cxgb3 must
not register that device.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-03-24 20:44:18 -07:00
Stephen Hemminger d0929553be infiniband: convert nes driver to net_device_ops
Also, removed unnecessary memset() since alloc_netdev returns
zeroed memory.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-21 19:19:13 -07:00
Stephen Hemminger 687c75dcf3 infiniband: convert c2 to net_device_ops
Convert this driver to new net_device_ops infrastructure.
Also use default net_device get-stats infrastructure

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-21 19:19:13 -07:00
Yevgeny Petrilin a6a47771b1 IB/mlx4: Unregister IB device prior to CLOSE PORT command
According to the ConnectX programmer's reference manual, all
operations should be stopped, all QPs should be torn down and all WQEs
flushed before the CLOSE_PORT command is invoked.  In some cases
reversing the order of operations (as implemented now) could cause
a loss of completions.

Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-03-18 19:49:54 -07:00
Faisal Latif c12e56ef69 RDMA/nes: Don't allow userspace QPs to use STag zero
STag zero is a special STag that allows consumers to access any bus
address without registering memory.  The nes driver unfortunately
allows STag zero to be used even with QPs created by unprivileged
userspace consumers, which means that any process with direct verbs
access to the nes device can read and write any memory accessible to
the underlying PCI device (usually any memory in the system).  Such
access is usually given for cluster software such as MPI to use, so
this is a local privilege escalation bug on most systems running this
driver.

The driver was using STag zero to receive the last streaming mode
data; to allow STag zero to be disabled for unprivileged QPs, the
driver now registers a special MR for this data.

Cc: <stable@kernel.org>
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-12 16:21:41 -07:00
Faisal Latif 9d5ab13325 RDMA/nes: Handle MPA Reject message properly
While doing testing, there are failures as MPA Reject call is not
handled.  To handle MPA Reject call, following changes are done:

*Handle inbound/outbound MPA Reject response message.
	When nes_reject() is called for pending MPA request reply,
	send the MPA Reject message to its peer (active
	side)cm_node. The peer cm_node (active side) will indicate
	Reject message event for the pending Connect Request.

*Handle MPA Reject response message for loopback connections and listener.
	When MPA Request is rejected, check if it is a loopback
	connection and if it is then it will send Reject message event
	to its peer loopback node. Also when destroying listener,
	check if the cm_nodes for that listener are loopback or not.

*Add gracefull connection close with the MPA Reject response message.
	Send gracefull close (FIN, FIN ACK..) to terminate the cm_nodes.

*Some code re-org while making the above changes.
	Removed recv_list and recv_list_lock from the cm_node
	structure as there can be only one receive close entry on the
	timer. Also implemented handle_recv_entry() as receive close
	entry is processed from both nes_rem_ref_cm_node() as well as
	nes_cm_timer_tick().

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-03-06 15:15:01 -08:00
Don Wood 0145f341a9 RDMA/nes: Improve use of PBLs
Two level 256 byte PBLs was not implemented so the driver could report
out of memory when in fact there were PBLs still available.

This solution prefers to use 4KB PBLs over two level 256B PBLs until
the number of 4KB PBLs falls below a threshold.  At this point the 4KB
PBL structure is converted to use 256B PBLs which prevents the driver
from running out of 4KB PBLs too quickly.

Signed-off-by: Don Wood <donald.e.wood@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-03-06 15:15:00 -08:00
Faisal Latif 2869975cfb RDMA/nes: Remove LLTX
NETIF_F_LLTX is deprecated. Remove private TX locking from the driver
and remove the NETIF_F_LLTX feature flag.  This also fixes a warning
in some configs that comes from doing skb_linearize() call in the
hard_start_xmit method with IRQs disabled (if HIGHMEM is enabled,
skb_linearize() may end up enabling BHs, which is a no-no if hard IRQs
are disabled in that context).  By getting rid of LLTX, we do not
disable IRQs when skb_linearize() is called.

Remove the sq_lock as it is not needed for non-LLTX.  Fix ethtool not
to show the counter for sq_lock.

Reported-by: aluno3@poczta.onet.pl
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-03-06 15:12:11 -08:00
Don Wood fd87778cb9 RDMA/nes: Inform hardware that asynchronous event has been handled
When asynchronous events are processed by software, it is necessary
to let the hardware know that software has handled the event.  This
frees up the entry in the asynchronous event queue.

Signed-off-by: Don Wood <donald.e.wood@intel.com>
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-03-06 15:12:11 -08:00
Chien Tung 7b14ab0b43 RDMA/nes: Fix tmp_addr compilation warning
In find_node(), tmp_addr causes an "unused variable" warning when
INFINIBAND_NES_DEBUG is not defined.  It's only used in a nes_debug()
and the print does not make sense.  So take out the whole thing.

Reported-by: Manish Katiyar <mkatiyar@gmail.com>
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-03-06 15:12:11 -08:00
Chien Tung b9c367e7e6 RDMA/nes: Report correct vendor_id and vendor_part_id
ibv_devinfo displays 0 for vendor_id and vendor_part_id.  Fill in OUI
and device_id for those two fields.

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-03-06 15:12:10 -08:00
Chien Tung cd6853d3eb RDMA/nes: Update copyright to new legal entity and year
Update copyright to the new legal entity, Intel-NE, Inc., an Intel
company.  Update copyright for the new year.

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-03-06 15:12:10 -08:00
Don Wood dae5d13a7e RDMA/nes: Account for freed PBL after HW operation
Fix occurrences where the software PBL counts were changed before the
hardware was updated.  This bug allowed another thread to overallocate
the hardware resources.

Add proper PBL accounting in case nes_reg_mr() fails.

Signed-off-by: Don Wood <donald.e.wood@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-03-06 15:12:09 -08:00
Roland Dreier e538052746 IB/ipath: Really run work in ipath_release_user_pages_on_close()
ipath_release_user_pages_on_close() just allocated a structure to
schedule work with but just returned (leaking the structure) rather than 
actually doing schedule_work().  Fix the logic to what was intended.

This was spotted by the Coverity checker (CID 2700).

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-02-22 20:14:37 -08:00
Roland Dreier 71c4512201 IB/ipath: Fix memory leak in init_shadow_tids() error path
If the second vmalloc() fails, the wrong pointer is pased to vfree(), so
the first vmalloc() ends up getting leaked.

This was spotted by the Coverity checker (CID 2709).

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-02-22 20:04:34 -08:00
Ingo Molnar 74019224ac timers: add mod_timer_pending()
Impact: new timer API

Based on an idea from Martin Josefsson with the help of
Patrick McHardy and Stephen Hemminger:

introduce the mod_timer_pending() API which is a mod_timer()
offspring that is an invariant on already removed timers.

(regular mod_timer() re-activates non-pending timers.)

This is useful for the networking code in that it can
allow unserialized mod_timer_pending() timer-forwarding
calls, but a single del_timer*() will stop the timer
from being reactivated again.

Also while at it:

- optimize the regular mod_timer() path some more, the
  timer-stat and a debug check was needlessly duplicated
  in __mod_timer().

- make the exports come straight after the function, as
  most other exports in timer.c already did.

- eliminate __mod_timer() as an external API, change the
  users to mod_timer().

The regular mod_timer() code path is not impacted
significantly, due to inlining optimizations and due to
the simplifications.

Based-on-patch-from: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Patrick McHardy <kaber@trash.net>
Cc: netdev@vger.kernel.org
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-18 19:26:33 +01:00
Steve Wise 4263289630 RDMA/cxgb3: Remove modulo math from build_rdma_recv()
Remove modulo usage to avoid a divide in the fast path (not all
gcc versions do strength reduction here).

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-02-16 21:23:32 -08:00
Steve Wise 42fb61f02f RDMA/cxgb3: Connection termination fixes
The poll and flush code needs to handle all send opcodes: SEND,
SEND_WITH_SE, SEND_WITH_INV, and SEND_WITH_SE_INV.

Ignore TERM indications if the connection already gone.

Ignore HW receive completions if the RQ is empty.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-02-10 16:38:57 -08:00
Steve Wise 900f4c16c3 RDMA/cxgb3: sgl/pbl offset calculation needs 64 bits
The variable 'offset' in iwch_sgl2pbl_map() needs to be a u64.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-02-10 16:38:22 -08:00
Moni Shoua 270b8b8513 IB/mthca: Fix dispatch of IB_EVENT_LID_CHANGE event
When snooping a PortInfo MAD, its client_reregister bit is checked.
If the bit is ON then a CLIENT_REREGISTER event is dispatched,
otherwise a LID_CHANGE event is dispatched.  This way of decision
ignores the cases where the MAD changes the LID along with an
instruction to reregister (so a necessary LID_CHANGE event won't be
dispatched) or the MAD is neither of these (and an unnecessary
LID_CHANGE event will be dispatched).

This causes problems at least with IPoIB, which will do a "light"
flush on reregister, rather than the "heavy" flush required due to a
LID change.

Fix this by dispatching a CLIENT_REREGISTER event if the
client_reregister bit is set, but also compare the LID in the MAD to
the current LID.  If and only if they are not identical then a
LID_CHANGE event is dispatched.

Signed-off-by: Moni Shoua <monis@voltaire.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Yossi Etigin <yosefe@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-01-28 15:15:56 -08:00
Moni Shoua f0f6f346a1 IB/mlx4: Fix dispatch of IB_EVENT_LID_CHANGE event
When snooping a PortInfo MAD, its client_reregister bit is checked.
If the bit is ON then a CLIENT_REREGISTER event is dispatched,
otherwise a LID_CHANGE event is dispatched.  This way of decision
ignores the cases where the MAD changes the LID along with an
instruction to reregister (so a necessary LID_CHANGE event won't be
dispatched) or the MAD is neither of these (and an unnecessary
LID_CHANGE event will be dispatched).

This causes problems at least with IPoIB, which will do a "light"
flush on reregister, rather than the "heavy" flush required due to a
LID change.

Fix this by dispatching a CLIENT_REREGISTER event if the
client_reregister bit is set, but also compare the LID in the MAD to
the current LID.  If and only if they are not identical then a
LID_CHANGE event is dispatched.

Signed-off-by: Moni Shoua <monis@voltaire.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Yossi Etigin <yosefe@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-01-28 14:54:35 -08:00
Divy Le Ray a73efd0a85 iw_cxgb3: handle chip reset notifications
Freeze activity when notified that the underlying chip
is getting reset on a EEH event or fatal error.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-26 22:22:19 -08:00
Ben Hutchings 288379f050 net: Remove redundant NAPI functions
Following the removal of the unused struct net_device * parameter from
the NAPI functions named *netif_rx_* in commit 908a7a1, they are
exactly equivalent to the corresponding *napi_* functions and are
therefore redundant.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-21 14:33:50 -08:00
Harvey Harrison 9c3da09917 IB: Remove __constant_{endian} uses
The base versions handle constant folding just fine, use them
directly.  The replacements are OK in the include/ files as they are
not exported to userspace so we don't need the __ prefixed versions.

This patch does not affect code generation at all.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-01-17 17:11:57 -08:00
Roland Dreier ac8581d408 Merge branches 'ehca', 'ipoib' and 'mlx4' into for-linus 2009-01-16 15:05:54 -08:00
Stephen Rothwell ee96aae573 IB/ehca: Use consistent types for ehca_plpar_hcall9()
ehca_plpar_hcall9() takes an unsigned long array, so make all callers
pass that in.  This fixes warnings introduced by commit fe333321
("powerpc: Change u64/s64 to a long long integer type"), which changed
u64 from unsigned long to unsigned long long.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-01-16 14:55:40 -08:00
Stephen Rothwell 3750f60557 IB/ehca: Fix printk format warnings from u64 type change
Commit fe333321 ("powerpc: Change u64/s64 to a long long integer
type") changed u64 from unsigned long to unsigned long long, which
means that printk formats for printing u64 values should use "ll"
instead of "l" to avoid warnings.  Fix all the places affected by this
in ehca.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-01-16 14:55:28 -08:00
Roland Dreier 0fd7e1d855 IB/mlx4: Fix memory ordering problem when posting LSO sends
The current work request posting code writes the LSO segment before
writing any data segments.  This leaves a window where the LSO segment
overwrites the stamping in one cacheline that the HCA prefetches
before the rest of the cacheline is filled with the correct data
segments.  When the HCA processes this work request, a local
protection error may result.

Fix this by saving the LSO header size field off and writing it only
after all data segments are written.  This fix is a cleaned-up version
of a patch from Jack Morgenstein <jackm@dev.mellanox.co.il>.

This fixes <https://bugs.openfabrics.org/show_bug.cgi?id=1383>.

Reported-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-01-16 12:47:47 -08:00
Linus Torvalds ccbf04f24c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  IB/iser: Add dependency on INFINIBAND_ADDR_TRANS
  IPoIB: Do not join broadcast group if interface is brought down
  RDMA/nes: Fix for NIPQUAD removal
  IPoIB: Fix loss of connectivity after bonding failover on both sides
  IB/mlx4: Don't register IB device for adapters with no IB ports
  mlx4_core: Fix warning from min()
  IB/ehca: spin_lock_irqsave() takes an unsigned long
2009-01-13 08:19:42 -08:00
Roland Dreier 8c9ea7fe96 Merge branches 'ehca', 'ipoib', 'iser', 'mlx4' and 'nes' into for-next 2009-01-12 19:37:31 -08:00
Harvey Harrison 03080e5cbe RDMA/nes: Fix for NIPQUAD removal
Commit 63779436 ("drivers: replace NIPQUAD()") accidentally replaced
some HIPQUAD()s, causing IP addresses to be printed in reverse order.
Add temporary local vars until the byteswapping can be pushed further
up the stack.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-01-10 21:45:42 -08:00
Roland Dreier 22e7ef9c08 IB/mlx4: Don't register IB device for adapters with no IB ports
If the mlx4_ib driver finds an adapter that has only ethernet ports, the
current code will register an IB device with 0 ports.  Nothing useful or
sensible can be done with such a device, so just skip registering it.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-01-09 13:22:29 -08:00
Coly Li 73ac36ea14 fix similar typos to successfull
When I review ocfs2 code, find there are 2 typos to "successfull".  After
doing grep "successfull " in kernel tree, 22 typos found totally -- great
minds always think alike :)

This patch fixes all the similar typos. Thanks for Randy's ack and comments.

Signed-off-by: Coly Li <coyli@suse.de>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Acked-by: Roland Dreier <rolandd@cisco.com>
Cc: Jeremy Kerr <jk@ozlabs.org>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Vlad Yasevich <vladislav.yasevich@hp.com>
Cc: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-08 08:31:15 -08:00
Stephen Rothwell 7ddccb234c IB/ehca: spin_lock_irqsave() takes an unsigned long
The flags argument to spin_lock_irqsave() should really be unsigned
long.  This will also help prevent some warnings when we change u64 to
unsigned long long.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-01-07 11:24:36 -08:00
Frederik Schwarzer 025dfdafe7 trivial: fix then -> than typos in comments and documentation
- (better, more, bigger ...) then -> (...) than

Signed-off-by: Frederik Schwarzer <schwarzerf@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2009-01-06 11:28:06 +01:00
Al Viro 56ff5efad9 zero i_uid/i_gid on inode allocation
... and don't bother in callers.  Don't bother with zeroing i_blocks,
while we are at it - it's already been zeroed.

i_mode is not worth the effort; it has no common default value.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-01-05 11:54:28 -05:00
Rusty Russell 2ca1a61583 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6
Conflicts:

	arch/x86/kernel/io_apic.c
2008-12-31 23:05:57 +10:30
Roland Dreier f781a22fa2 IB/mlx4: Fix reading SL field out of cqe->sl_vid
Commit f780a9f1 ("mlx4_core: Add ethernet fields to CQE struct")
introduced a bug in how wc->sl is set in mlx4_ib_poll_one() -- since
cqe->sl_vid is a big-endian value, the shift must be done after
converting to host endianness.

This bug was found using sparse endianness checking.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-12-30 15:30:26 -08:00
Rusty Russell cbe31f02f5 cpumask: use new cpumask API in drivers/infiniband/hw/ipath
Impact: cleanup

We're moving from handing around cpumask_t's to handing around struct
cpumask *'s.  cpus_*, cpumask_t and cpu_*_map are deprecated: convert
to cpumask_*, cpu_*_mask.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Ralph Campbell <infinipath@qlogic.com>
2008-12-30 09:05:18 +10:30