Commit Graph

138 Commits (d2ff4fc557a4c5248b2d99b0d48e47a246d994b2)

Author SHA1 Message Date
Linus Torvalds 4c171acc20 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  RDMA/cma: Save PID of ID's owner
  RDMA/cma: Add support for netlink statistics export
  RDMA/cma: Pass QP type into rdma_create_id()
  RDMA: Update exported headers list
  RDMA/cma: Export enum cma_state in <rdma/rdma_cm.h>
  RDMA/nes: Add a check for strict_strtoul()
  RDMA/cxgb3: Don't post zero-byte read if endpoint is going away
  RDMA/cxgb4: Use completion objects for event blocking
  IB/srp: Fix integer -> pointer cast warnings
  IB: Add devnode methods to cm_class and umad_class
  IB/mad: Return EPROTONOSUPPORT when an RDMA device lacks the QP required
  IB/uverbs: Add devnode method to set path/mode
  RDMA/ucma: Add .nodename/.mode to tell userspace where to create device node
  RDMA: Add netlink infrastructure
  RDMA: Add error handling to ib_core_init()
2011-05-26 12:13:57 -07:00
Steve Wise c337374bf2 RDMA/cxgb4: Use completion objects for event blocking
There exists a race condition when using wait_queue_head_t objects
that are declared on the stack.  This was being done in a few places
where we are sending work requests to the FW and awaiting replies, but
we don't have an endpoint structure with an embedded c4iw_wr_wait
struct.  So the code was allocating it locally on the stack.  Bad
design.  The race is:

  1) thread on cpuX declares the wait_queue_head_t on the stack, then
     posts a firmware WR with that wait object ptr as the cookie to be
     returned in the WR reply.  This thread will proceed to block in
     wait_event_timeout() but before it does:

  2) An interrupt runs on cpuY with the WR reply.  fw6_msg() handles
     this and calls c4iw_wake_up().  c4iw_wake_up() sets the condition
     variable in the c4iw_wr_wait object to TRUE and will call
     wake_up(), but before it calls wake_up():

  3) The thread on cpuX calls c4iw_wait_for_reply(), which calls
     wait_event_timeout().  The wait_event_timeout() macro checks the
     condition variable and returns immediately since it is TRUE.  So
     this thread never blocks/sleeps. The function then returns
     effectively deallocating the c4iw_wr_wait object that was on the
     stack.

  4) So at this point cpuY has a pointer to the c4iw_wr_wait object
     that is no longer valid.  Further its pointing to a stack frame
     that might now be in use by some other context/thread.  So cpuY
     continues execution and calls wake_up() on a ptr to a wait object
     that as been effectively deallocated.

This race, when it hits, can cause a crash in wake_up(), which I've
seen under heavy stress. It can also corrupt the referenced stack
which can cause any number of failures.

The fix:

Use struct completion, which supports on-stack declarations.
Completions use a spinlock around setting the condition to true and
the wake up so that steps 2 and 4 above are atomic and step 3 can
never happen in-between.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
2011-05-24 09:47:38 -07:00
Linus Torvalds 06f4e926d2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1446 commits)
  macvlan: fix panic if lowerdev in a bond
  tg3: Add braces around 5906 workaround.
  tg3: Fix NETIF_F_LOOPBACK error
  macvlan: remove one synchronize_rcu() call
  networking: NET_CLS_ROUTE4 depends on INET
  irda: Fix error propagation in ircomm_lmp_connect_response()
  irda: Kill set but unused variable 'bytes' in irlan_check_command_param()
  irda: Kill set but unused variable 'clen' in ircomm_connect_indication()
  rxrpc: Fix set but unused variable 'usage' in rxrpc_get_transport()
  be2net: Kill set but unused variable 'req' in lancer_fw_download()
  irda: Kill set but unused vars 'saddr' and 'daddr' in irlan_provider_connect_indication()
  atl1c: atl1c_resume() is only used when CONFIG_PM_SLEEP is defined.
  rxrpc: Fix set but unused variable 'usage' in rxrpc_get_peer().
  rxrpc: Kill set but unused variable 'local' in rxrpc_UDP_error_handler()
  rxrpc: Kill set but unused variable 'sp' in rxrpc_process_connection()
  rxrpc: Kill set but unused variable 'sp' in rxrpc_rotate_tx_window()
  pkt_sched: Kill set but unused variable 'protocol' in tc_classify()
  isdn: capi: Use pr_debug() instead of ifdefs.
  tg3: Update version to 3.119
  tg3: Apply rx_discards fix to 5719/5720
  ...

Fix up trivial conflicts in arch/x86/Kconfig and net/mac80211/agg-tx.c
as per Davem.
2011-05-20 13:43:21 -07:00
Benjamin Herrenschmidt 880102e785 Merge remote branch 'origin/master' into merge
Manual merge of arch/powerpc/kernel/smp.c and add missing scheduler_ipi()
call to arch/powerpc/platforms/cell/interrupt.c

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-20 15:36:52 +10:00
Steve Wise 2f25e9a540 RDMA/cxgb4: EEH errors can hang the driver
A few more EEH fixes:

c4iw_wait_for_reply(): detect fatal EEH condition on timeout and
return an error.

The iw_cxgb4 driver was only calling ib_deregister_device() on an EEH
event followed by a ib_register_device() when the device was
reinitialized.  However, the RDMA core doesn't allow multiple
iterations of register/deregister by the provider. See
drivers/infiniband/core/sysfs.c: ib_device_unregister_sysfs() where
the kobject ref is held until the device is deallocated in
ib_deallocate_device().  Calling deregister adds this kobj reference,
and then a subsequent register call will generate a WARN_ON() from the
kobject subsystem because the kobject is being initialized but is
already initialized with the ref held.

So the provider must deregister and dealloc when resetting for an EEH
event, then alloc/register to re-initialize.  To do this, we cannot
use the device ptr as our ULD handle since it will change with each
reallocation.  This commit adds a ULD context struct which is used as
the ULD handle, and then contains the device pointer and other state
needed.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-05-09 22:06:23 -07:00
Steve Wise d9594d990a RDMA/cxgb4: Reset wait condition atomically
The driver was never really waiting for RDMA_WR/FINI completions
because the condition variable used to determine if the completion
happened was never reset, and this condition variable is reused for
both connection setup and teardown.  This causes various driver
crashes under heavy loads due to releasing resources too early.

The fix is to use atomic bits to correctly reset the condition
immediately after the completion is detected.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-05-09 22:06:22 -07:00
Roel Kluin 85d215b0f3 RDMA/cxgb4: Fix missing parentheses
Parens are missing: '|' has a higher presedence than '?'.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-05-09 22:06:22 -07:00
Steve Wise bbe9a0a2bc RDMA/cxgb4: Initialization errors can cause crash
c4iw_uld_add() must return ERR_PTR() values instead of NULL on failure.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-05-09 22:06:22 -07:00
Steve Wise 30c95c2d49 RDMA/cxgb4: Don't change QP state outside EP lock
Concurrent ingress CLOSE and ULP ABORT operations causes a crash due
to a race condition where the close path releases the EP lock and then
tries to move the QP state to CLOSED.  This must be done inside the EP
lock to avoid the race.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-05-09 22:06:22 -07:00
David S. Miller 31e4543db2 ipv4: Make caller provide on-stack flow key to ip_route_output_ports().
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-03 20:25:42 -07:00
Nishanth Aravamudan e297d9dd5c cxgb4: use pgprot_writecombine() on powerpc
Commit fe3cc0d99d ("powerpc: Add
pgprot_writecombine") in benh's tree exposes the pgprot_writecombine()
API to drivers on powerpc. cxgb4 has an open-coded version of the same,
so use the common API now that it's available.

Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Cc: Steve Wise <swise@opengridcomputing.com>
Cc: Anton Blanchard <anton@samba.org>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-27 14:18:25 +10:00
Linus Torvalds 7a6362800c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1480 commits)
  bonding: enable netpoll without checking link status
  xfrm: Refcount destination entry on xfrm_lookup
  net: introduce rx_handler results and logic around that
  bonding: get rid of IFF_SLAVE_INACTIVE netdev->priv_flag
  bonding: wrap slave state work
  net: get rid of multiple bond-related netdevice->priv_flags
  bonding: register slave pointer for rx_handler
  be2net: Bump up the version number
  be2net: Copyright notice change. Update to Emulex instead of ServerEngines
  e1000e: fix kconfig for crc32 dependency
  netfilter ebtables: fix xt_AUDIT to work with ebtables
  xen network backend driver
  bonding: Improve syslog message at device creation time
  bonding: Call netif_carrier_off after register_netdevice
  bonding: Incorrect TX queue offset
  net_sched: fix ip_tos2prio
  xfrm: fix __xfrm_route_forward()
  be2net: Fix UDP packet detected status in RX compl
  Phonet: fix aligned-mode pipe socket buffer header reserve
  netxen: support for GbE port settings
  ...

Fix up conflicts in drivers/staging/brcm80211/brcmsmac/wl_mac80211.c
with the staging updates.
2011-03-16 16:29:25 -07:00
Steve Wise db5d040d7b RDMA/cxgb4: Debugfs dump_qp() updates
- Show whether the SQ is in onchip memory or not.
- Dump both SQ and RQ QIDs.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-03-14 12:09:14 -07:00
Steve Wise 767fbe8151 RDMA/cxgb4: Dispatch FATAL event on EEH errors
This at least kicks the user mode applications that are watching for
device events.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-03-14 12:09:13 -07:00
Steve Wise b48f3b9c10 RDMA/cxgb4: Use ULP_MODE_TCPDDP
Set the ULP mode for initial RDMA connection setup to the proper DDP
mode.  This avoids wasting some HW resources while in streaming mode.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-03-14 12:09:12 -07:00
Steve Wise a9c7719800 RDMA/cxgb4: Enable on-chip SQ support by default
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-03-14 12:09:12 -07:00
Steve Wise ffc3f7487f RDMA/cxgb4: Do CIDX_INC updates every 1/16 CQ depth CQE reaps
This avoids the CIDX_INC overflow issue with T4A2 when running
kernel RDMA applications.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-03-14 12:09:11 -07:00
Steve Wise 2942813739 RDMA/cxgb4: Remove db_drop_task
Unloading iw_cxgb4 can crash due to the unload code trying to use
db_drop_task, which is uninitialized.  So remove this dead code.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-03-14 12:09:10 -07:00
Steve Wise b52fe09e33 RDMA/cxgb4: Turn on delayed ACK
Set the default to on.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-03-14 12:09:09 -07:00
David S. Miller 78fbfd8a65 ipv4: Create and use route lookup helpers.
The idea here is this minimizes the number of places one has to edit
in order to make changes to how flows are defined and used.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-12 15:08:42 -08:00
David S. Miller b23dd4fe42 ipv4: Make output route lookup return rtable directly.
Instead of on the stack.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-02 14:31:35 -08:00
David S. Miller 273447b352 ipv4: Kill can_sleep arg to ip_route_output_flow()
This boolean state is now available in the flow flags.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-01 14:27:04 -08:00
David S. Miller 420d44daa7 ipv4: Make final arg to ip_route_output_flow to be boolean "can_sleep"
Since that is what the current vague "flags" argument means.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-01 14:19:23 -08:00
Steve Wise 94788657c9 RDMA/cxgb4: Set the correct device physical function for iWARP connections
The PF passed to FW was 0, causing PCI failures in an SR-IOV environment.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Cc: <stable@kernel.org>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-01-28 15:34:28 -08:00
Steve Wise 6a09a9d694 RDMA/cxgb4: Limit MAXBURST EQ context field to 256B
MAXBURST cannot exceed 256B for on-chip queues.  With a 512B MAXBURST,
we can lock up the chip.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Cc: <stable@kernel.org>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-01-28 15:34:24 -08:00
Linus Torvalds 008d23e485 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (43 commits)
  Documentation/trace/events.txt: Remove obsolete sched_signal_send.
  writeback: fix global_dirty_limits comment runtime -> real-time
  ppc: fix comment typo singal -> signal
  drivers: fix comment typo diable -> disable.
  m68k: fix comment typo diable -> disable.
  wireless: comment typo fix diable -> disable.
  media: comment typo fix diable -> disable.
  remove doc for obsolete dynamic-printk kernel-parameter
  remove extraneous 'is' from Documentation/iostats.txt
  Fix spelling milisec -> ms in snd_ps3 module parameter description
  Fix spelling mistakes in comments
  Revert conflicting V4L changes
  i7core_edac: fix typos in comments
  mm/rmap.c: fix comment
  sound, ca0106: Fix assignment to 'channel'.
  hrtimer: fix a typo in comment
  init/Kconfig: fix typo
  anon_inodes: fix wrong function name in comment
  fix comment typos concerning "consistent"
  poll: fix a typo in comment
  ...

Fix up trivial conflicts in:
 - drivers/net/wireless/iwlwifi/iwl-core.c (moved to iwl-legacy.c)
 - fs/ext4/ext4.h

Also fix missed 'diabled' typo in drivers/net/bnx2x/bnx2x.h while at it.
2011-01-13 10:05:56 -08:00
Steve Wise db8b101671 RDMA/cxgb4: Don't re-init wait object in init/fini paths
Re-initializing the wait object in rdma_init()/rdma_fini() causes a
timing window which can lead to a deadlock during close.  Once this
deadlock hits, all RDMA activity over the T4 device will be stuck.

There's no need to re-init the wait object, so remove it.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Cc: <stable@kernel.org>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2011-01-10 17:41:43 -08:00
Stephen Hemminger c943109163 RDMA/cxgb3,cxgb4: Remove dead code
This removes unused code found by running 'make namespacecheck';
compile tested only.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2011-01-10 17:41:43 -08:00
Jesper Juhl e987fa357a infiniband: Only include mutex.h once in drivers/infiniband/hw/cxgb4/iw_cxgb4.h
Only include the header linux/mutex.h once inside
drivers/infiniband/hw/cxgb4/iw_cxgb4.h

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-11-15 14:35:19 +01:00
Linus Torvalds 9e5fca251f Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (63 commits)
  IB/qib: clean up properly if pci_set_consistent_dma_mask() fails
  IB/qib: Allow driver to load if PCIe AER fails
  IB/qib: Fix uninitialized pointer if CONFIG_PCI_MSI not set
  IB/qib: Fix extra log level in qib_early_err()
  RDMA/cxgb4: Remove unnecessary KERN_<level> use
  RDMA/cxgb3: Remove unnecessary KERN_<level> use
  IB/core: Add link layer type information to sysfs
  IB/mlx4: Add VLAN support for IBoE
  IB/core: Add VLAN support for IBoE
  IB/mlx4: Add support for IBoE
  mlx4_en: Change multicast promiscuous mode to support IBoE
  mlx4_core: Update data structures and constants for IBoE
  mlx4_core: Allow protocol drivers to find corresponding interfaces
  IB/uverbs: Return link layer type to userspace for query port operation
  IB/srp: Sync buffer before posting send
  IB/srp: Use list_first_entry()
  IB/srp: Reduce number of BUSY conditions
  IB/srp: Eliminate two forward declarations
  IB/mlx4: Signal node desc changes to SM by using FW to generate trap 144
  IB: Replace EXTRA_CFLAGS with ccflags-y
  ...
2010-10-26 17:54:22 -07:00
Roland Dreier 116e9535fe Merge branches 'amso1100', 'cma', 'cxgb3', 'cxgb4', 'ehca', 'iboe', 'ipoib', 'misc', 'mlx4', 'nes', 'qib' and 'srp' into for-next 2010-10-26 16:09:11 -07:00
Joe Perches aa1ad26089 RDMA/cxgb4: Remove unnecessary KERN_<level> use
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-10-26 13:45:59 -07:00
matt mooney 7454159d3c IB: Replace EXTRA_CFLAGS with ccflags-y
Signed-off-by: matt mooney <mfm@muteddisk.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-10-23 13:45:03 -07:00
Steve Wise da411ba1da RDMA/cxgb4: Use cxgb4 service for packet gl to skb
Remove the local service t4_pktgl_to_skb() and use cxgb4_pktgl_to_skb()
exported by cxgb4.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-10-22 21:58:50 -07:00
Steve Wise de5dd81b49 RDMA/cxgb4: Export T4 TCP MIB
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-10-22 21:57:26 -07:00
Justin P. Mattock 631dd1a885 Update broken web addresses in the kernel.
The patch below updates broken web addresses in the kernel

Signed-off-by: Justin P. Mattock <justinmattock@gmail.com>
Cc: Maciej W. Rozycki <macro@linux-mips.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Finn Thain <fthain@telegraphics.com.au>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Dimitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Mike Frysinger <vapier.adi@gmail.com>
Acked-by: Ben Pfaff <blp@cs.stanford.edu>
Acked-by: Hans J. Koch <hjk@linutronix.de>
Reviewed-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-10-18 11:03:14 +02:00
Steve Wise 3160977a6e RDMA/cxgb4: Use simple_read_from_buffer() for debugfs handlers
We can replace our equivalent open-coded version.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-10-11 20:15:14 -07:00
Steve Wise 8bbac892fb RDMA/cxgb4: Add default_llseek to debugfs files
Incorporate BKL removal changes.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-10-11 20:14:00 -07:00
Steve Wise 40dbf6ee38 RDMA/cxgb4: Fastreg NSMR fixes
- Remove dsgl support - doesn't work in T4.
- Wrap the immediate PBL as needed when building it in the wr.
- Adjust max pbl depth allowed based on ulptx alignment requirements.
- Bump the slots per SQ to 5 to allow up to 128MB fast registers.
- Advertise fastreg support by default.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-09-28 10:53:50 -07:00
Steve Wise 410ade4c26 RDMA/cxgb4: Don't set completion flag for read requests
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-09-28 10:53:49 -07:00
Steve Wise 98ae68b7ee RDMA/cxgb4: Set the default TCP send window to 128KB
This helps with large IO throughput.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-09-28 10:53:48 -07:00
Steve Wise 2f5b48c3ad RDMA/cxgb4: Use a mutex for QP and EP state transitions
Move the connection setup/teardown paths to the workq thread removing
spin lock/irq disable requirements for these paths.  This allows calls
down to the LLD for EP and QP state transition actions to be atomic
with respect to processing CPL messages coming up from the HW.
Namely, calls to rdma_init() and rdma_fini() can now be called with
the mutex held avoiding many race conditions with the abort path.

The QP spinlock is still used but only to manipulate the qp state.  This
allows the fastpaths, poll, post_send, and pos_recv, to run in the
irq context.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-09-28 10:53:48 -07:00
Steve Wise c6d7b26791 RDMA/cxgb4: Support on-chip SQs
T4 support on-chip SQs to reduce latency.  This patch adds support for
this in iw_cxgb4:

 - Manage ocqp memory like other adapter mem resources.
 - Allocate user mode SQs from ocqp mem if available.
 - Map ocqp mem to user process using write combining.
 - Map PCIE_MA_SYNC reg to user process.

Bump uverbs ABI.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-09-28 10:46:35 -07:00
Steve Wise aadc4df308 RDMA/cxgb4: Centralize the wait logic
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-09-28 10:46:34 -07:00
Steve Wise 9e8d1fa342 RDMA/cxgb4: debugfs files for dumping active stags
Add "stags" debugfs file.  This is useful for examining the TPTE and
PBL entries in adapter memory.  It allows scripts to dump just the
active entries.

Also clean up the "qps" file handlers and shared common code.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-09-28 10:46:33 -07:00
Steve Wise 05fb962947 RDMA/cxgb4: Log HW lack-of-resource errors
This helps debug cases where HW resources are depleted.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-09-28 10:46:33 -07:00
Steve Wise 0e42c1f430 RDMA/cxgb4: Handle CPL_RDMA_TERMINATE messages
T4 FW sends up CPL_RDMA_TERMINATE to indicate a peer TERM.  This
triggers the QP moving to TERMINATE state.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-09-28 10:46:32 -07:00
Steve Wise 6ff0e343b3 RDMA/cxgb4: Ignore TERMINATE CQEs
T4 incorrectly inserts TERM CQEs into the CQ.  Silently ignore them.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-09-28 10:46:31 -07:00
Steve Wise 7459486133 RDMA/cxgb4: Ignore positive return values from cxgb4_*_send() functions
The cxgb4_*_send() functions return NET_XMIT_ values, which are
positive integers or negative errno values.  So don't treat positive
return values as an error.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-09-28 10:46:31 -07:00
Steve Wise 13fecb83b4 RDMA/cxgb4: Zero out ISGL padding
The HW design requires zeroing any pad in SGLs.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-09-28 10:46:30 -07:00
Steve Wise af93fb5dcc RDMA/cxgb4: Don't use null ep ptr
In c4iw_modify_qp() error path, only use qhp->ep if ep is not already set.
Otherwise qhp->ep can be NULL and we crash.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-09-28 10:46:29 -07:00
Roland Dreier c8e081a1bf RDMA/cxgb4: Fix warnings about casts to/from pointers of different sizes
Fix:

  drivers/infiniband/hw/cxgb4/qp.c: In function ‘create_qp’:
  drivers/infiniband/hw/cxgb4/qp.c:147: warning: cast from pointer to integer of different size
  drivers/infiniband/hw/cxgb4/qp.c: In function ‘rdma_fini’:
  drivers/infiniband/hw/cxgb4/qp.c:988: warning: cast from pointer to integer of different size
  drivers/infiniband/hw/cxgb4/qp.c: In function ‘rdma_init’:
  drivers/infiniband/hw/cxgb4/qp.c:1063: warning: cast from pointer to integer of different size
  drivers/infiniband/hw/cxgb4/mem.c: In function ‘write_adapter_mem’:
  drivers/infiniband/hw/cxgb4/mem.c:74: warning: cast from pointer to integer of different size
  drivers/infiniband/hw/cxgb4/cq.c: In function ‘destroy_cq’:
  drivers/infiniband/hw/cxgb4/cq.c:58: warning: cast from pointer to integer of different size
  drivers/infiniband/hw/cxgb4/cq.c: In function ‘create_cq’:
  drivers/infiniband/hw/cxgb4/cq.c:135: warning: cast from pointer to integer of different size
  drivers/infiniband/hw/cxgb4/cm.c: In function ‘fw6_msg’:
  drivers/infiniband/hw/cxgb4/cm.c:2326: warning: cast to pointer from integer of different size

by casting pointers to unsigned long instead of u64.

Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-09-27 17:51:04 -07:00
Steve Wise 93fb72e443 RDMA/cxgb4: Obtain RDMA QID ranges from LLD/FW
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-08-07 23:08:47 -07:00
Linus Torvalds 3cc08fc35d Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (42 commits)
  IB/qib: Add missing <linux/slab.h> include
  IB/ehca: Drop unnecessary NULL test
  RDMA/nes: Fix confusing if statement indentation
  IB/ehca: Init irq tasklet before irq can happen
  RDMA/nes: Fix misindented code
  RDMA/nes: Fix showing wqm_quanta
  RDMA/nes: Get rid of "set but not used" variables
  RDMA/nes: Read firmware version from correct place
  IB/srp: Export req_lim via sysfs
  IB/srp: Make receive buffer handling more robust
  IB/srp: Use print_hex_dump()
  IB: Rename RAW_ETY to RAW_ETHERTYPE
  RDMA/nes: Fix two sparse warnings
  RDMA/cxgb3: Make needlessly global iwch_l2t_send() static
  IB/iser: Make needlessly global iser_alloc_rx_descriptors() static
  RDMA/cxgb4: Add timeouts when waiting for FW responses
  IB/qib: Fix race between qib_error_qp() and receive packet processing
  IB/qib: Limit the number of packets processed per interrupt
  IB/qib: Allow writes to the diag_counters to be able to clear them
  IB/qib: Set cfgctxts to number of CPUs by default
  ...
2010-08-07 17:08:02 -07:00
Linus Torvalds 3cfc2c42c1 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (48 commits)
  Documentation: update broken web addresses.
  fix comment typo "choosed" -> "chosen"
  hostap:hostap_hw.c Fix typo in comment
  Fix spelling contorller -> controller in comments
  Kconfig.debug: FAIL_IO_TIMEOUT: typo Faul -> Fault
  fs/Kconfig: Fix typo Userpace -> Userspace
  Removing dead MACH_U300_BS26
  drivers/infiniband: Remove unnecessary casts of private_data
  fs/ocfs2: Remove unnecessary casts of private_data
  libfc: use ARRAY_SIZE
  scsi: bfa: use ARRAY_SIZE
  drm: i915: use ARRAY_SIZE
  drm: drm_edid: use ARRAY_SIZE
  synclink: use ARRAY_SIZE
  block: cciss: use ARRAY_SIZE
  comment typo fixes: charater => character
  fix comment typos concerning "challenge"
  arm: plat-spear: fix typo in kerneldoc
  reiserfs: typo comment fix
  update email address
  ...
2010-08-04 15:31:02 -07:00
Linus Torvalds 6ba74014c1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1443 commits)
  phy/marvell: add 88ec048 support
  igb: Program MDICNFG register prior to PHY init
  e1000e: correct MAC-PHY interconnect register offset for 82579
  hso: Add new product ID
  can: Add driver for esd CAN-USB/2 device
  l2tp: fix export of header file for userspace
  can-raw: Fix skb_orphan_try handling
  Revert "net: remove zap_completion_queue"
  net: cleanup inclusion
  phy/marvell: add 88e1121 interface mode support
  u32: negative offset fix
  net: Fix a typo from "dev" to "ndev"
  igb: Use irq_synchronize per vector when using MSI-X
  ixgbevf: fix null pointer dereference due to filter being set for VLAN 0
  e1000e: Fix irq_synchronize in MSI-X case
  e1000e: register pm_qos request on hardware activation
  ip_fragment: fix subtracting PPPOE_SES_HLEN from mtu twice
  net: Add getsockopt support for TCP thin-streams
  cxgb4: update driver version
  cxgb4: add new PCI IDs
  ...

Manually fix up conflicts in:
 - drivers/net/e1000e/netdev.c: due to pm_qos registration
   infrastructure changes
 - drivers/net/phy/marvell.c: conflict between adding 88ec048 support
   and cleaning up the IDs
 - drivers/net/wireless/ipw2x00/ipw2100.c: trivial ipw2100_pm_qos_req
   conflict (registration change vs marking it static)
2010-08-04 11:47:58 -07:00
Steve Wise a5f4a07820 RDMA/cxgb4: Add timeouts when waiting for FW responses
Don't hang a host thread if the FW stops responding.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-08-04 09:54:42 -07:00
Jiri Kosina d790d4d583 Merge branch 'master' into for-next 2010-08-04 15:14:38 +02:00
Steve Wise ca5a22028d RDMA/cxgb4: Set/reset the EP timer inside EP lock
Endpoint timer manipulation needs to be done inside the lock.  Otherwise
we can get into a situation where a timer is stopped before it is started,
which hits the WARN_ON() in stop_ep_timer().

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-08-02 21:06:17 -07:00
Steve Wise d4f1a5c6ef RDMA/cxgb4: Use correct control txq
There is only one control txq per tx channel.  So use the port number
as the queue index when sending.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-08-02 21:06:12 -07:00
Steve Wise 73d6fcad2a RDMA/cxgb4: Fix race in fini path
There exists a race condition where the app disconnects, which
initiates an orderly close (via rdma_fini()), concurrently with an
ingress abort condition, which initiates an abortive close operation.
Since rdma_fini() must be called without IRQs disabled, the fini can
be called after the QP has been transitioned to ERROR.  This is ok,
but we need to protect against qp->ep getting NULLed.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-08-02 21:06:06 -07:00
Steve Wise d37ac31ddc RDMA/cxgb4: Support variable sized work requests
T4 EQ entries are in multiples of 64 bytes.  Currently the RDMA SQ and
RQ use fixed sized entries composed of 4 EQ entries for the SQ and 2
EQ entries for the RQ.  For optimial latency with small IO, we need to
change this so the HW only needs to DMA the EQ entries actually used
by a given work request.

Implementation:

- add wq_pidx counter to track where we are in the EQ.  cidx/pidx are
  used for the sw sq/rq tracking and flow control.

- the variable part of work requests is the SGL.  Add new functions to
  build the SGL and/or immediate data directly in the EQ memory
  wrapping when needed.

- adjust the min burst size for the EQ contexts to 64B.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-07-21 11:16:20 -07:00
David Rientjes d3c814e8b2 RDMA/cxgb4: Remove dependency on __GFP_NOFAIL
The alloc_skb() in various allocations are failable, so remove
__GFP_NOFAIL from their masks.

Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-07-21 10:55:05 -07:00
Steve Wise ba6d39256b RDMA/cxgb4: Add module option to tweak delayed ack
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-07-21 10:53:52 -07:00
Roland Dreier 85963e4cbc RDMA/cxgb4: Remove unneeded NULL check
The rest of the code seems to assume that ep->com.cm_id can't be NULL,
so remove an unneeded test.

Reported-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-07-19 13:13:09 -07:00
Dan Carpenter c1d7356c85 RDMA/cxgb4: Remove unneeded assignment
We don't need to assign rpl here, we do that later on.

Signed-off-by: Dan Carpenter <error27@gmail.com>

[ Indeed this assignment makes no sense, since skb is set to NULL a
  couple of lines before.  - Roland ]

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-07-19 13:09:40 -07:00
Steve Wise 2c5934bfc5 RDMA/cxgb4: Derive smac_idx from port viid
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-07-06 14:05:16 -07:00
Steve Wise 1973e8b8ed RDMA/cxgb4: Avoid false GTS CIDX_INC overflows
The T4 IQ hw design assumes CIDX_INC credits will be returned on a
regular basis and always before the CIDX counter crosses over the PIDX
counter.  For RDMA CQs, however, returning CIDX_INC credits is only
needed and desired when and if the CQ is armed for notification.  This
can lead to a GTS write returning credits that causes the HW to reject
the credit update because it causes CIDX to pass PIDX.  Once this
happens, the CIDX/PIDX counters get out of whack and an application
can miss a notification and get stuck blocked awaiting a notification.

To avoid this, we allocate the HW IQ 2x times the requested size.
This seems to avoid the false overflow failures.  If we see more
issues with this, then we'll have to add code in the poll path to
return credits periodically like when the amount reaches 1/2 the queue
depth).  I would like to avoid this as it adds a PCI write transaction
for applications that never arm the CQ (like most MPIs).

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-07-06 14:04:04 -07:00
Steve Wise b21ef16a8b RDMA/cxgb4: Don't call abort_connection() for active connect failures
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-07-06 14:02:54 -07:00
FUJITA Tomonori f38926aa1d RDMA/cxgb4: Use the DMA state API instead of the pci equivalents
This replace the PCI DMA state API (include/linux/pci-dma.h) with the
DMA equivalents since the PCI DMA state API will be obsolete.

No functional change.

For further information about the background:

http://marc.info/?l=linux-netdev&m=127037540020276&w=2

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-07-06 14:01:42 -07:00
Jiri Kosina f1bbbb6912 Merge branch 'master' into for-next 2010-06-16 18:08:13 +02:00
Uwe Kleine-König 732bee7af3 fix typos concerning "hierarchy"
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-06-16 18:03:14 +02:00
Changli Gao d8d1f30b95 net-next: remove useless union keyword
remove useless union keyword in rtable, rt6_info and dn_route.

Since there is only one member in a union, the union keyword isn't useful.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-10 23:31:35 -07:00
Roland Dreier acdc30b56a Merge branches 'cxgb4', 'misc', 'mlx4', 'nes' and 'qib' into for-next 2010-05-25 09:54:03 -07:00
Steve Wise 30a6a62fc3 RDMA/cxgb4: Only insert sq qid in lookup table
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-24 21:08:05 -07:00
Steve Wise 2f1fb507ee RDMA/cxgb4: Support IB_WR_READ_WITH_INV opcode
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-24 21:08:04 -07:00
Steve Wise 4ab1eb9c8d RDMA/cxgb4: Set fence flag for inv-local-stag work requests
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-24 21:08:04 -07:00
Steve Wise f64b88433c RDMA/cxgb4: Update some HW limits
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-24 21:08:03 -07:00
Steve Wise 25737bd4ca RDMA/cxgb4: Don't limit fastreg page list depth
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-24 21:08:03 -07:00
Steve Wise 841dba9a5a RDMA/cxgb4: Return proper errors in fastreg mr/pbl allocation
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-24 21:08:02 -07:00
Steve Wise 7ec45b9234 RDMA/cxgb4: Fix overflow bug in CQ arm
- wrap cq->cqidx_inc based on cq size.
- optimize t4_arm_cq logic.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-24 21:08:01 -07:00
Steve Wise 84172dee05 RDMA/cxgb4: Optimize CQ overflow detection
1) save the timestamp flit in the cq when we consume a CQE.

2) always compare the saved flit with the previous entry flit when
   reading the next CQE entry.  If the flits don't compare, then we
   have overflowed.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-24 21:08:01 -07:00
Steve Wise 895cf5f3d6 RDMA/cxgb4: CQ size must be IQ size - 2
We need 1 extra entry for the status page and 1 to always have 1 free
entry to detect when the queue is full.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-24 21:08:00 -07:00
Steve Wise 1c01c53883 RDMA/cxgb4: Register RDMA provider based on LLD state_change events
The LLD now supports proper UP state change events, so move the RDMA
provider registration to UP path.

This fixes a crash when loading iw_cxgb4 _after_ the NFS/RDMA
transport is up and running.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-24 21:07:59 -07:00
Steve Wise fd388ce677 RDMA/cxgb4: Detach from the LLD after unregistering RDMA device
In the RDMA core unregister path, kernel users will be calling down
into the T4 provider to release resources.  So we cannot detach from
the LLD until this process completes.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-24 21:07:59 -07:00
Ralph Campbell 9a6edb60ec IB/core: Allow device-specific per-port sysfs files
Add a new parameter to ib_register_device() so that low-level device
drivers can pass in a pointer to a callback function that will be
called for each port that is registered in sysfs.  This allows
low-level device drivers to create files in

    /sys/class/infiniband/<hca>/ports/<N>/

without having to poke through the internals of the RDMA sysfs handling.

There is no need for an unregister function since the kobject
reference will go to zero when ib_unregister_device() is called.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-21 10:34:44 -07:00
Roland Dreier be4c9bad9d MAINTAINERS: Add cxgb4 and iw_cxgb4 entries
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-05 14:45:40 -07:00
Steve Wise cfdda9d764 RDMA/cxgb4: Add driver for Chelsio T4 RNIC
Add an RDMA/iWARP driver for Chelsio T4 Ethernet adapters.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-04-21 15:30:06 -07:00