Commit Graph

2714 Commits (34a991587a5cc9f78960c2c9beea217866458c41)

Author SHA1 Message Date
Linus Torvalds 4c171acc20 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  RDMA/cma: Save PID of ID's owner
  RDMA/cma: Add support for netlink statistics export
  RDMA/cma: Pass QP type into rdma_create_id()
  RDMA: Update exported headers list
  RDMA/cma: Export enum cma_state in <rdma/rdma_cm.h>
  RDMA/nes: Add a check for strict_strtoul()
  RDMA/cxgb3: Don't post zero-byte read if endpoint is going away
  RDMA/cxgb4: Use completion objects for event blocking
  IB/srp: Fix integer -> pointer cast warnings
  IB: Add devnode methods to cm_class and umad_class
  IB/mad: Return EPROTONOSUPPORT when an RDMA device lacks the QP required
  IB/uverbs: Add devnode method to set path/mode
  RDMA/ucma: Add .nodename/.mode to tell userspace where to create device node
  RDMA: Add netlink infrastructure
  RDMA: Add error handling to ib_core_init()
2011-05-26 12:13:57 -07:00
Roland Dreier 8dc4abdf4c Merge branches 'cma', 'cxgb3', 'cxgb4', 'misc', 'nes', 'netlink', 'srp' and 'uverbs' into for-next 2011-05-25 13:47:20 -07:00
Nir Muchtar 83e9502d8d RDMA/cma: Save PID of ID's owner
Save the PID associated with an RDMA CM ID for reporting via netlink.

Signed-off-by: Nir Muchtar <nirm@voltaire.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-05-25 13:46:23 -07:00
Nir Muchtar 753f618ae0 RDMA/cma: Add support for netlink statistics export
Add callbacks and data types for statistics export of all current
devices/ids.  The schema for RDMA CM is a series of netlink messages.
Each one contains an rdma_cm_stat struct.  Additionally, two netlink
attributes are created for the addresses for each message (if
applicable).

Their types used are:
RDMA_NL_RDMA_CM_ATTR_SRC_ADDR (The source address for this ID)
RDMA_NL_RDMA_CM_ATTR_DST_ADDR (The destination address for this ID)
sockaddr_* structs are encapsulated within these attributes.

In other words, every transaction contains a series of messages like:

-------message 1-------
struct rdma_cm_id_stats {
       __u32 qp_num;
       __u32 bound_dev_if;
       __u32 port_space;
       __s32 pid;
       __u8 cm_state;
       __u8 node_type;
       __u8 port_num;
       __u8 reserved;
}
RDMA_NL_RDMA_CM_ATTR_SRC_ADDR attribute - contains the source address
RDMA_NL_RDMA_CM_ATTR_DST_ADDR attribute - contains the destination address
-------end 1-------
-------message 2-------
struct rdma_cm_id_stats
RDMA_NL_RDMA_CM_ATTR_SRC_ADDR attribute
RDMA_NL_RDMA_CM_ATTR_DST_ADDR attribute
-------end 2-------

Signed-off-by: Nir Muchtar <nirm@voltaire.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-05-25 13:46:23 -07:00
Sean Hefty b26f9b9949 RDMA/cma: Pass QP type into rdma_create_id()
The RDMA CM currently infers the QP type from the port space selected
by the user.  In the future (eg with RDMA_PS_IB or XRC), there may not
be a 1-1 correspondence between port space and QP type.  For netlink
export of RDMA CM state, we want to export the QP type to userspace,
so it is cleaner to explicitly associate a QP type to an ID.

Modify rdma_create_id() to allow the user to specify the QP type, and
use it to make our selections of datagram versus connected mode.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-05-25 13:46:23 -07:00
Nir Muchtar 550e5ca77e RDMA/cma: Export enum cma_state in <rdma/rdma_cm.h>
Move cma.c's internal definition of enum cma_state to enum rdma_cm_state
in an exported header so that it can be exported via RDMA netlink.

Signed-off-by: Nir Muchtar <nirm@voltaire.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-05-25 13:46:22 -07:00
Liu Yuan 52f81dbaf1 RDMA/nes: Add a check for strict_strtoul()
It should check if strict_strtoul() succeeds before using
'wqm_quanta_value'.

Signed-off-by: Liu Yuan <tailai.ly@taobao.com>

[ Convert to kstrtoul() directly while we're here.  - Roland ]

Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-05-24 10:06:25 -07:00
Steve Wise 807838686e RDMA/cxgb3: Don't post zero-byte read if endpoint is going away
tx_ack() wasn't checking the endpoint state and consequently would
attempt to post the p2p 0B read on an endpoint/QP that is closing or
aborting.  This causes a NULL pointer dereference crash.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-05-24 10:01:04 -07:00
Steve Wise c337374bf2 RDMA/cxgb4: Use completion objects for event blocking
There exists a race condition when using wait_queue_head_t objects
that are declared on the stack.  This was being done in a few places
where we are sending work requests to the FW and awaiting replies, but
we don't have an endpoint structure with an embedded c4iw_wr_wait
struct.  So the code was allocating it locally on the stack.  Bad
design.  The race is:

  1) thread on cpuX declares the wait_queue_head_t on the stack, then
     posts a firmware WR with that wait object ptr as the cookie to be
     returned in the WR reply.  This thread will proceed to block in
     wait_event_timeout() but before it does:

  2) An interrupt runs on cpuY with the WR reply.  fw6_msg() handles
     this and calls c4iw_wake_up().  c4iw_wake_up() sets the condition
     variable in the c4iw_wr_wait object to TRUE and will call
     wake_up(), but before it calls wake_up():

  3) The thread on cpuX calls c4iw_wait_for_reply(), which calls
     wait_event_timeout().  The wait_event_timeout() macro checks the
     condition variable and returns immediately since it is TRUE.  So
     this thread never blocks/sleeps. The function then returns
     effectively deallocating the c4iw_wr_wait object that was on the
     stack.

  4) So at this point cpuY has a pointer to the c4iw_wr_wait object
     that is no longer valid.  Further its pointing to a stack frame
     that might now be in use by some other context/thread.  So cpuY
     continues execution and calls wake_up() on a ptr to a wait object
     that as been effectively deallocated.

This race, when it hits, can cause a crash in wake_up(), which I've
seen under heavy stress. It can also corrupt the referenced stack
which can cause any number of failures.

The fix:

Use struct completion, which supports on-stack declarations.
Completions use a spinlock around setting the condition to true and
the wake up so that steps 2 and 4 above are atomic and step 3 can
never happen in-between.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
2011-05-24 09:47:38 -07:00
Roland Dreier 737b94eb41 IB/srp: Fix integer -> pointer cast warnings
Fix

    drivers/infiniband/ulp/srp/ib_srp.c: In function 'srp_handle_recv':
    drivers/infiniband/ulp/srp/ib_srp.c:1150: warning: cast to pointer from integer of different size
    drivers/infiniband/ulp/srp/ib_srp.c: In function 'srp_send_completion':
    drivers/infiniband/ulp/srp/ib_srp.c🔢 warning: cast to pointer from integer of different size

by adding an intermediate cast to uintptr_t.

Signed-off-by: Roland Dreier <roland@purestorage.com>
Acked-by: David Dillow <dillowda@ornl.gov>
2011-05-23 11:30:04 -07:00
Roland Dreier c3af0980ce IB: Add devnode methods to cm_class and umad_class
We want the ucmX, umadX and issmX device nodes to show up under
/dev/infiniband, and additionally ucmX should have mode 0666.  Add
appropriate devnode methods to their class structs for this.

Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-05-23 11:24:28 -07:00
Ira Weiny c8367c4cd9 IB/mad: Return EPROTONOSUPPORT when an RDMA device lacks the QP required
We had a script which was looping through the devices returned from
ibstat and attempted to register a SMI agent on an ethernet device.
This caused a kernel panic for IBoE devices that don't have QP0.

Fix this by checking if the QP exists before using it.

Signed-off-by: Ira Weiny <weiny2@llnl.gov>
Acked-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-05-23 11:15:48 -07:00
Roland Dreier 71c29bd5c2 IB/uverbs: Add devnode method to set path/mode
We want udev to create a device node under /dev/infiniband with
permission 0666 for uverbsX devices, so add a devnode method to set the
appropriate info.

Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-05-23 11:10:05 -07:00
Roland Dreier 04ea2f8197 RDMA/ucma: Add .nodename/.mode to tell userspace where to create device node
We want udev to create a device node under /dev/infiniband with
permission 0666 for rdma_cm, so add that info to our struct miscdevice.

Signed-off-by: Roland Dreier <roland@purestorage.com>
Acked-by: Sean Hefty <sean.hefty@intel.com>
2011-05-23 10:48:43 -07:00
Paul Gortmaker 70c7160619 Add appropriate <linux/prefetch.h> include for prefetch users
After discovering that wide use of prefetch on modern CPUs
could be a net loss instead of a win, net drivers which were
relying on the implicit inclusion of prefetch.h via the list
headers showed up in the resulting cleanup fallout.  Give
them an explicit include via the following $0.02 script.

 =========================================
 #!/bin/bash
 MANUAL=""
 for i in `git grep -l 'prefetch(.*)' .` ; do
 	grep -q '<linux/prefetch.h>' $i
 	if [ $? = 0 ] ; then
 		continue
 	fi

 	(	echo '?^#include <linux/?a'
 		echo '#include <linux/prefetch.h>'
 		echo .
 		echo w
 		echo q
 	) | ed -s $i > /dev/null 2>&1
 	if [ $? != 0 ]; then
 		echo $i needs manual fixup
 		MANUAL="$i $MANUAL"
 	fi
 done
 echo ------------------- 8\<----------------------
 echo vi $MANUAL
 =========================================

Signed-off-by: Paul <paul.gortmaker@windriver.com>
[ Fixed up some incorrect #include placements, and added some
  non-network drivers and the fib_trie.c case    - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-22 21:41:57 -07:00
Linus Torvalds 06f4e926d2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1446 commits)
  macvlan: fix panic if lowerdev in a bond
  tg3: Add braces around 5906 workaround.
  tg3: Fix NETIF_F_LOOPBACK error
  macvlan: remove one synchronize_rcu() call
  networking: NET_CLS_ROUTE4 depends on INET
  irda: Fix error propagation in ircomm_lmp_connect_response()
  irda: Kill set but unused variable 'bytes' in irlan_check_command_param()
  irda: Kill set but unused variable 'clen' in ircomm_connect_indication()
  rxrpc: Fix set but unused variable 'usage' in rxrpc_get_transport()
  be2net: Kill set but unused variable 'req' in lancer_fw_download()
  irda: Kill set but unused vars 'saddr' and 'daddr' in irlan_provider_connect_indication()
  atl1c: atl1c_resume() is only used when CONFIG_PM_SLEEP is defined.
  rxrpc: Fix set but unused variable 'usage' in rxrpc_get_peer().
  rxrpc: Kill set but unused variable 'local' in rxrpc_UDP_error_handler()
  rxrpc: Kill set but unused variable 'sp' in rxrpc_process_connection()
  rxrpc: Kill set but unused variable 'sp' in rxrpc_rotate_tx_window()
  pkt_sched: Kill set but unused variable 'protocol' in tc_classify()
  isdn: capi: Use pr_debug() instead of ifdefs.
  tg3: Update version to 3.119
  tg3: Apply rx_discards fix to 5719/5720
  ...

Fix up trivial conflicts in arch/x86/Kconfig and net/mac80211/agg-tx.c
as per Davem.
2011-05-20 13:43:21 -07:00
Roland Dreier b2cbae2c24 RDMA: Add netlink infrastructure
Add basic RDMA netlink infrastructure that allows for registration of
RDMA clients for which data is to be exported and supplies message
construction callbacks.

Signed-off-by: Nir Muchtar <nirm@voltaire.com>

[ Reorganize a few things, add CONFIG_NET dependency.  - Roland ]

Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-05-20 11:46:11 -07:00
Nir Muchtar fd75c789ab RDMA: Add error handling to ib_core_init()
Fail RDMA midlayer initialization if sysfs setup fails.

Signed-off-by: Nir Muchtar <nirm@voltaire.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-05-20 11:46:10 -07:00
Benjamin Herrenschmidt 880102e785 Merge remote branch 'origin/master' into merge
Manual merge of arch/powerpc/kernel/smp.c and add missing scheduler_ipi()
call to arch/powerpc/platforms/cell/interrupt.c

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-20 15:36:52 +10:00
Benjamin Herrenschmidt 4c8440666b Merge branch 'merge' into next 2011-05-19 17:00:06 +10:00
Roland Dreier 1df9fad122 Merge branches 'cma', 'cxgb4' and 'qib' into for-next 2011-05-12 08:57:20 -07:00
Sergei Shtylyov 1c65335714 IB/qib: Use pci_dev->revision
The driver reads PCI revision ID from the PCI configuration register
while it's already stored by PCI subsystem in the revision field of
struct pci_dev.

Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Acked-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-05-12 08:57:12 -07:00
David S. Miller 5fc3590c81 infiniband: Remove rt->rt_src usage in addr4_resolve()
Use an explicit flow key and fetch it from there.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-10 13:32:48 -07:00
Roland Dreier d0c49bf391 RDMA/iwcm: Get rid of enum iw_cm_event_status
The IW_CM_EVENT_STATUS_xxx values were used in only a couple of places;
cma.c uses -Exxx values instead, and so do the amso1100, cxgb3 and cxgb4
drivers -- only nes was using the enum values (with the mild consequence
that all nes connection failures were treated as generic errors rather
than reported as timeouts or rejections).

We can fix this confusion by getting rid of enum iw_cm_event_status and
using a plain int for struct iw_cm_event.status, and converting nes to
use -Exxx as the other iWARP drivers do.

This also gets rid of the warning

    drivers/infiniband/core/cma.c: In function 'cma_iw_handler':
    drivers/infiniband/core/cma.c:1333:3: warning: case value '4294967185' not in enumerated type 'enum iw_cm_event_status'
    drivers/infiniband/core/cma.c:1336:3: warning: case value '4294967186' not in enumerated type 'enum iw_cm_event_status'
    drivers/infiniband/core/cma.c:1332:3: warning: case value '4294967192' not in enumerated type 'enum iw_cm_event_status'

Signed-off-by: Roland Dreier <roland@purestorage.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Reviewed-by: Faisal Latif <faisal.latif@intel.com>
2011-05-09 22:23:57 -07:00
Sergei Shtylyov ec03d6777a IB/ipath: Use pci_dev->revision, again
Commit 44c10138fd ("PCI: Change all drivers to use pci_device->revision")
already converted this driver to using the revision field of struct
pci_dev but commit bb9171448d ("IB/ipath: Misc changes to prepare
for IB7220 introduction") later reverted that change for some strange
reason.  Restore the change.

Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Acked-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-05-09 22:07:31 -07:00
Mitko Haralanov 9f5754e34b IB/qib: Prevent driver hang with unprogrammed boards
The time limit test now correctly checks against current jiffies to
avoid the hang.

Signed-off-by: Mitko Haralanov <mitko@qlogic.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-05-09 22:07:31 -07:00
Steve Wise 2f25e9a540 RDMA/cxgb4: EEH errors can hang the driver
A few more EEH fixes:

c4iw_wait_for_reply(): detect fatal EEH condition on timeout and
return an error.

The iw_cxgb4 driver was only calling ib_deregister_device() on an EEH
event followed by a ib_register_device() when the device was
reinitialized.  However, the RDMA core doesn't allow multiple
iterations of register/deregister by the provider. See
drivers/infiniband/core/sysfs.c: ib_device_unregister_sysfs() where
the kobject ref is held until the device is deallocated in
ib_deallocate_device().  Calling deregister adds this kobj reference,
and then a subsequent register call will generate a WARN_ON() from the
kobject subsystem because the kobject is being initialized but is
already initialized with the ref held.

So the provider must deregister and dealloc when resetting for an EEH
event, then alloc/register to re-initialize.  To do this, we cannot
use the device ptr as our ULD handle since it will change with each
reallocation.  This commit adds a ULD context struct which is used as
the ULD handle, and then contains the device pointer and other state
needed.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-05-09 22:06:23 -07:00
Steve Wise d9594d990a RDMA/cxgb4: Reset wait condition atomically
The driver was never really waiting for RDMA_WR/FINI completions
because the condition variable used to determine if the completion
happened was never reset, and this condition variable is reused for
both connection setup and teardown.  This causes various driver
crashes under heavy loads due to releasing resources too early.

The fix is to use atomic bits to correctly reset the condition
immediately after the completion is detected.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-05-09 22:06:22 -07:00
Roel Kluin 85d215b0f3 RDMA/cxgb4: Fix missing parentheses
Parens are missing: '|' has a higher presedence than '?'.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-05-09 22:06:22 -07:00
Steve Wise bbe9a0a2bc RDMA/cxgb4: Initialization errors can cause crash
c4iw_uld_add() must return ERR_PTR() values instead of NULL on failure.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-05-09 22:06:22 -07:00
Steve Wise 30c95c2d49 RDMA/cxgb4: Don't change QP state outside EP lock
Concurrent ingress CLOSE and ULP ABORT operations causes a crash due
to a race condition where the close path releases the EP lock and then
tries to move the QP state to CLOSED.  This must be done inside the EP
lock to avoid the race.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-05-09 22:06:22 -07:00
Hefty, Sean a9bb79128a RDMA/cma: Add an ID_REUSEADDR option
Lustre requires that clients bind to a privileged port number before
connecting to a remote server.  On larger clusters (typically more
than about 1000 nodes), the number of privileged ports is exhausted,
resulting in lustre being unusable.

To handle this, we add support for reusable addresses to the rdma_cm.
This mimics the behavior of the socket option SO_REUSEADDR.  A user
may set an rdma_cm_id to reuse an address before calling
rdma_bind_addr() (explicitly or implicitly).  If set, other
rdma_cm_id's may be bound to the same address, provided that they all
have reuse enabled, and there are no active listens.

If rdma_listen() is called on an rdma_cm_id that has reuse enabled, it
will only succeed if there are no other id's bound to that same
address.  The reuse option is exported to user space.  The behavior of
the kernel reuse implementation was verified against that given by
sockets.

This patch is derived from a path by Ira Weiny <weiny2@llnl.gov>

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-05-09 22:06:10 -07:00
Hefty, Sean 43b752daae RDMA/cma: Fix handling of IPv6 addressing in cma_use_port
cma_use_port() assumes that the sockaddr is an IPv4 address.  Since
IPv6 addressing is supported (and also to support other address
families) make the code more generic in its address handling.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-05-09 22:06:10 -07:00
David S. Miller 31e4543db2 ipv4: Make caller provide on-stack flow key to ip_route_output_ports().
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-03 20:25:42 -07:00
David Decotigny 7073949720 ethtool: cosmetic: Use ethtool ethtool_cmd_speed API
This updates the network drivers so that they don't access the
ethtool_cmd::speed field directly, but use ethtool_cmd_speed()
instead.

For most of the drivers, these changes are purely cosmetic and don't
fix any problem, such as for those 1GbE/10GbE drivers that indirectly
call their own ethtool get_settings()/mii_ethtool_gset(). The changes
are meant to enforce code consistency and provide robustness with
future larger throughputs, at the expense of a few CPU cycles for each
ethtool operation.

All drivers compiled with make allyesconfig ion x86_64 have been
updated.

Tested: make allyesconfig on x86_64 + e1000e/bnx2x work
Signed-off-by: David Decotigny <decot@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-29 14:03:01 -07:00
Lucas De Marchi e9c549998d Revert wrong fixes for common misspellings
These changes were incorrectly fixed by codespell. They were now
manually corrected.

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
2011-04-26 23:31:11 -07:00
Nishanth Aravamudan e297d9dd5c cxgb4: use pgprot_writecombine() on powerpc
Commit fe3cc0d99d ("powerpc: Add
pgprot_writecombine") in benh's tree exposes the pgprot_writecombine()
API to drivers on powerpc. cxgb4 has an open-coded version of the same,
so use the common API now that it's available.

Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Cc: Steve Wise <swise@opengridcomputing.com>
Cc: Anton Blanchard <anton@samba.org>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-27 14:18:25 +10:00
Michał Mirosław 3d96c74d89 net: infiniband/ulp/ipoib: convert to hw_features
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-20 01:30:42 -07:00
Michał Mirosław dd6f6d0249 net: infiniband/hw/nes: convert to hw_features
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-20 01:30:41 -07:00
Lucas De Marchi 25985edced Fix common misspellings
Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
2011-03-31 11:26:23 -03:00
Linus Torvalds dc50eddb2f Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  RDMA/nes: Fix test of uninitialized netdev
2011-03-25 21:06:37 -07:00
Linus Torvalds 00a2470546 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (56 commits)
  route: Take the right src and dst addresses in ip_route_newports
  ipv4: Fix nexthop caching wrt. scoping.
  ipv4: Invalidate nexthop cache nh_saddr more correctly.
  net: fix pch_gbe section mismatch warning
  ipv4: fix fib metrics
  mlx4_en: Removing HW info from ethtool -i report.
  net_sched: fix THROTTLED/RUNNING race
  drivers/net/a2065.c: Convert release_resource to release_region/release_mem_region
  drivers/net/ariadne.c: Convert release_resource to release_region/release_mem_region
  bonding: fix rx_handler locking
  myri10ge: fix rmmod crash
  mlx4_en: updated driver version to 1.5.4.1
  mlx4_en: Using blue flame support
  mlx4_core: reserve UARs for userspace consumers
  mlx4_core: maintain available field in bitmap allocator
  mlx4: Add blue flame support for kernel consumers
  mlx4_en: Enabling new steering
  mlx4: Add support for promiscuous mode in the new steering model.
  mlx4: generalization of multicast steering.
  mlx4_en: Reporting HW revision in ethtool -i
  ...
2011-03-25 21:02:22 -07:00
Roland Dreier cf55bb2439 RDMA/nes: Fix test of uninitialized netdev
Commit 1765a57533 ("net: make dev->master general") introduced a
test of an uninitialized netdev.  Fix the code so the intended netdev
is tested.

Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-03-24 17:18:30 -07:00
Linus Torvalds 0625bef606 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  IB: Increase DMA max_segment_size on Mellanox hardware
  IB/mad: Improve an error message so error code is included
  RDMA/nes: Don't print success message at level KERN_ERR
  RDMA/addr: Fix return of uninitialized ret value
  IB/srp: try to use larger FMR sizes to cover our mappings
  IB/srp: add support for indirect tables that don't fit in SRP_CMD
  IB/srp: rework mapping engine to use multiple FMR entries
  IB/srp: allow sg_tablesize to be set for each target
  IB/srp: move IB CM setup completion into its own function
  IB/srp: always avoid non-zero offsets into an FMR
2011-03-24 07:59:46 -07:00
Yevgeny Petrilin 0345584e0b mlx4: generalization of multicast steering.
The same packet steering mechanism would be used both for IB and Ethernet,
Both multicasts and unicasts.
This commit prepares the general infrastructure for this.

Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-23 12:24:21 -07:00
Yevgeny Petrilin 725c89997e mlx4_en: Reporting HW revision in ethtool -i
HW revision is derived from device ID and rev id.

Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il>
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-23 12:24:20 -07:00
Roland Dreier ba82638247 Merge branches 'misc', 'nes' and 'srp' into for-next 2011-03-23 11:13:37 -07:00
David Dillow 7f9e5c48c1 IB: Increase DMA max_segment_size on Mellanox hardware
By default, each device is assumed to be able only handle 64 KB chunks
during DMA. By giving the segment size a larger value, the block layer
will coalesce more S/G entries together for SRP, allowing larger
requests with the same sg_tablesize setting.  The block layer is the
only direct user of it, though a few IOMMU drivers reference it as
well for their *_map_sg coalescing code. pci-gart_64 on x86, and a
smattering on on sparc, powerpc, and ia64.

Since other IB protocols could potentially see larger segments with
this, let's check those:

 - iSER is fine, because you limit your maximum request size to 512
   KB, so we'll never overrun the page vector in struct iser_page_vec
   (128 entries currently). It is independent of the DMA segment size,
   and handles multi-page segments already.

 - IPoIB is fine, as it maps each page individually, and doesn't use
   ib_dma_map_sg().

 - RDS appears to do the right thing and has no dependencies on DMA
   segment size, but I don't claim to have done a complete audit.

 - NFSoRDMA and 9p are OK -- they do not use ib_dma_map_sg(), so they
   doesn't care about the coalescing.

 - Lustre's ko2iblnd does not care about coalescing -- it properly
   walks the returned sg list.

This patch ups the value on Mellanox hardware to 1 GB, which matches
reported firmware limits on mlx4.

Signed-off-by: David Dillow <dillowda@ornl.gov>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-03-22 09:39:18 -07:00
Roland Dreier 071c778347 Merge branch 'external-indirect' of git://git.kernel.org/pub/scm/linux/kernel/git/dad/srp-initiator into srp 2011-03-21 11:52:00 -07:00
Michael Heinz 1eba843dd7 IB/mad: Improve an error message so error code is included
Signed-off-by: Michael Heinz <michael.heinz@qlogic.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-03-18 09:42:20 -07:00