Commit graph

3035 commits

Author SHA1 Message Date
Or Gerlitz
c938a616aa IB/core: Add raw packet QP type
IB_QPT_RAW_PACKET allows applications to build a complete packet,
including L2 headers, when sending; on the receive side, the HW will
not strip any headers.

This QP type is designed for userspace direct access to Ethernet; for
example by applications that do TCP/IP themselves.  Only processes
with the NET_RAW capability are allowed to create raw packet QPs (the
name "raw packet QP" is supposed to suggest an analogy to AF_PACKET /
SOL_RAW sockets).

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-05-08 11:18:09 -07:00
Roland Dreier
349556692d RDMA/ocrdma: Fix build with IPV6=n
When IPV6 is not enabled:

    ERROR: "register_inet6addr_notifier" [drivers/infiniband/hw/ocrdma/ocrdma.ko] undefined!
    ERROR: "unregister_inet6addr_notifier" [drivers/infiniband/hw/ocrdma/ocrdma.ko] undefined!

Fix this by wrapping the inet6 calls in #ifdef IPV6.  Also make the
ocrdma module depend on (IPV6 || IPV6=n) to forbid the case of modular
ipv6 but built-in ocrdma (which can't work, because ocrdma calls ipv6
functions).

Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-05-08 11:17:49 -07:00
Dan Carpenter
d19081e044 RDMA/ocrdma: Tiny locking cleanup
We only need to disable the IRQs one time.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>

[ Rename "wq_flags" to more conventional "flags."  - Roland ]

Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-05-08 11:17:49 -07:00
Dan Carpenter
55a8d62a3b RDMA/ocrdma: Fix check for NULL instead of IS_ERR
The ocrdma_alloc_lkey() function never returns NULL pointers -- it
returns ERR_PTRs.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-05-08 11:17:49 -07:00
Sasha Levin
3e4d60a82e RDMA/ocrdma: Don't sleep in atomic notifier handler
Events sent to ocrdma_inet6addr_event() are sent from an atomic context,
therefore we can't try to lock a mutex within the notifier callback.

We could just switch the mutex to a spinlock since all it does it
protect a list, but I've gone ahead and switched the list to use RCU
instead.  I couldn't fully test it since I don't have IB hardware, so
if it doesn't fully work for some reason let me know and I'll switch
it back to using a spinlock.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>

[ Fixed locking in ocrdma_add().  - Roland ]

Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-05-08 11:17:48 -07:00
Roland Dreier
c592c42331 RDMA/ocrdma: Remove write-only variables
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-05-08 11:17:48 -07:00
Roland Dreier
e9db29534d RDMA/ocrdma: Set event's device member in ocrdma_dispatch_ibevent()
We need to set ib_evt.device, or else ib_dispatch_event() will crash
when we call it for unaffiliated events (and consumers may get
confused in their QP/CQ/SRQ event handler for affiliated events).

Also fix sparse warning:

    drivers/infiniband/hw/ocrdma/ocrdma_hw.c:678:36: warning: Using plain integer as NULL pointer

There's no need to clear ib_evt, since every member is initialized.

Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-05-08 11:17:48 -07:00
Roland Dreier
abe3afacc5 RDMA/ocrdma: Make needlessly global functions/structs static
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-05-08 11:17:48 -07:00
Roland Dreier
da4964387d RDMA/ocrdma: Fix warnings about uninitialized variables
First, fix

    drivers/infiniband/hw/ocrdma/ocrdma_verbs.c: In function 'ocrdma_alloc_pd':
    drivers/infiniband/hw/ocrdma/ocrdma_verbs.c:371:17: warning: 'dpp_page_addr' may be used uninitialized in this function [-Wuninitialized]
    drivers/infiniband/hw/ocrdma/ocrdma_verbs.c:337:6: note: 'dpp_page_addr' was declared here

which seems that it may border on a bug (the call to ocrdma_del_mmap()
might conceivably do bad things if pd->dpp_enabled is not set and
dpp_page_addr ends up with just the wrong value).

Also take care of:

    drivers/infiniband/hw/ocrdma/ocrdma_hw.c: In function 'ocrdma_init_hw':
    drivers/infiniband/hw/ocrdma/ocrdma_hw.c:2587:5: warning: 'status' may be used uninitialized in this function [-Wuninitialized]
    drivers/infiniband/hw/ocrdma/ocrdma_hw.c:2549:17: note: 'status' was declared here

which is only real if num_eq == 0, which should be impossible.

Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-05-08 11:17:47 -07:00
Parav Pandit
fe2caefcdf RDMA/ocrdma: Add driver for Emulex OneConnect IBoE RDMA adapter
Signed-off-by: Parav Pandit <parav.pandit@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-05-08 11:17:47 -07:00
Sean Hefty
b6cec8aa4a RDMA/cma: Fix lockdep false positive recursive locking
The following lockdep problem was reported by Or Gerlitz <ogerlitz@mellanox.com>:

    [ INFO: possible recursive locking detected ]
    3.3.0-32035-g1b2649e-dirty #4 Not tainted
    ---------------------------------------------
    kworker/5:1/418 is trying to acquire lock:
     (&id_priv->handler_mutex){+.+.+.}, at: [<ffffffffa0138a41>] rdma_destroy_i    d+0x33/0x1f0 [rdma_cm]

    but task is already holding lock:
     (&id_priv->handler_mutex){+.+.+.}, at: [<ffffffffa0135130>] cma_disable_ca    llback+0x24/0x45 [rdma_cm]

    other info that might help us debug this:
     Possible unsafe locking scenario:

           CPU0
           ----
      lock(&id_priv->handler_mutex);
      lock(&id_priv->handler_mutex);

     *** DEADLOCK ***

     May be due to missing lock nesting notation

    3 locks held by kworker/5:1/418:
     #0:  (ib_cm){.+.+.+}, at: [<ffffffff81042ac1>] process_one_work+0x210/0x4a    6
     #1:  ((&(&work->work)->work)){+.+.+.}, at: [<ffffffff81042ac1>] process_on    e_work+0x210/0x4a6
     #2:  (&id_priv->handler_mutex){+.+.+.}, at: [<ffffffffa0135130>] cma_disab    le_callback+0x24/0x45 [rdma_cm]

    stack backtrace:
    Pid: 418, comm: kworker/5:1 Not tainted 3.3.0-32035-g1b2649e-dirty #4
    Call Trace:
     [<ffffffff8102b0fb>] ? console_unlock+0x1f4/0x204
     [<ffffffff81068771>] __lock_acquire+0x16b5/0x174e
     [<ffffffff8106461f>] ? save_trace+0x3f/0xb3
     [<ffffffff810688fa>] lock_acquire+0xf0/0x116
     [<ffffffffa0138a41>] ? rdma_destroy_id+0x33/0x1f0 [rdma_cm]
     [<ffffffff81364351>] mutex_lock_nested+0x64/0x2ce
     [<ffffffffa0138a41>] ? rdma_destroy_id+0x33/0x1f0 [rdma_cm]
     [<ffffffff81065a78>] ? trace_hardirqs_on_caller+0x11e/0x155
     [<ffffffff81065abc>] ? trace_hardirqs_on+0xd/0xf
     [<ffffffffa0138a41>] rdma_destroy_id+0x33/0x1f0 [rdma_cm]
     [<ffffffffa0139c02>] cma_req_handler+0x418/0x644 [rdma_cm]
     [<ffffffffa012ee88>] cm_process_work+0x32/0x119 [ib_cm]
     [<ffffffffa0130299>] cm_req_handler+0x928/0x982 [ib_cm]
     [<ffffffffa01302f3>] ? cm_req_handler+0x982/0x982 [ib_cm]
     [<ffffffffa0130326>] cm_work_handler+0x33/0xfe5 [ib_cm]
     [<ffffffff81065a78>] ? trace_hardirqs_on_caller+0x11e/0x155
     [<ffffffffa01302f3>] ? cm_req_handler+0x982/0x982 [ib_cm]
     [<ffffffff81042b6e>] process_one_work+0x2bd/0x4a6
     [<ffffffff81042ac1>] ? process_one_work+0x210/0x4a6
     [<ffffffff813669f3>] ? _raw_spin_unlock_irq+0x2b/0x40
     [<ffffffff8104316e>] worker_thread+0x1d6/0x350
     [<ffffffff81042f98>] ? rescuer_thread+0x241/0x241
     [<ffffffff81046a32>] kthread+0x84/0x8c
     [<ffffffff8136e854>] kernel_thread_helper+0x4/0x10
     [<ffffffff81366d59>] ? retint_restore_args+0xe/0xe
     [<ffffffff810469ae>] ? __init_kthread_worker+0x56/0x56
     [<ffffffff8136e850>] ? gs_change+0xb/0xb

The actual locking is fine, since we're dealing with different locks,
but from the same lock class.  cma_disable_callback() acquires the
listening id mutex, whereas rdma_destroy_id() acquires the mutex for
the new connection id.  To fix this, delay the call to
rdma_destroy_id() until we've released the listening id mutex.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-05-08 11:17:34 -07:00
Roland Dreier
5909ce545d IB/uverbs: Lock SRQ / CQ / PD objects in a consistent order
Since XRC support was added, the uverbs code has locked SRQ, CQ and PD
objects needed during QP and SRQ creation in different orders
depending on the the code path.  This leads to the (at least
theoretical) possibility of deadlock, and triggers the lockdep splat
below.

Fix this by making sure we always lock the SRQ first, then CQs and
finally the PD.

    ======================================================
    [ INFO: possible circular locking dependency detected ]
    3.4.0-rc5+ #34 Not tainted
    -------------------------------------------------------
    ibv_srq_pingpon/2484 is trying to acquire lock:
     (SRQ-uobj){+++++.}, at: [<ffffffffa00af51b>] idr_read_uobj+0x2f/0x4d [ib_uverbs]

    but task is already holding lock:
     (CQ-uobj){+++++.}, at: [<ffffffffa00af51b>] idr_read_uobj+0x2f/0x4d [ib_uverbs]

    which lock already depends on the new lock.

    the existing dependency chain (in reverse order) is:

    -> #2 (CQ-uobj){+++++.}:
           [<ffffffff81070fd0>] lock_acquire+0xbf/0xfe
           [<ffffffff81384f28>] down_read+0x34/0x43
           [<ffffffffa00af51b>] idr_read_uobj+0x2f/0x4d [ib_uverbs]
           [<ffffffffa00af542>] idr_read_obj+0x9/0x19 [ib_uverbs]
           [<ffffffffa00b16c3>] ib_uverbs_create_qp+0x180/0x684 [ib_uverbs]
           [<ffffffffa00ae3dd>] ib_uverbs_write+0xb7/0xc2 [ib_uverbs]
           [<ffffffff810fe47f>] vfs_write+0xa7/0xee
           [<ffffffff810fe65f>] sys_write+0x45/0x69
           [<ffffffff8138cdf9>] system_call_fastpath+0x16/0x1b

    -> #1 (PD-uobj){++++++}:
           [<ffffffff81070fd0>] lock_acquire+0xbf/0xfe
           [<ffffffff81384f28>] down_read+0x34/0x43
           [<ffffffffa00af51b>] idr_read_uobj+0x2f/0x4d [ib_uverbs]
           [<ffffffffa00af542>] idr_read_obj+0x9/0x19 [ib_uverbs]
           [<ffffffffa00af8ad>] __uverbs_create_xsrq+0x96/0x386 [ib_uverbs]
           [<ffffffffa00b31b9>] ib_uverbs_detach_mcast+0x1cd/0x1e6 [ib_uverbs]
           [<ffffffffa00ae3dd>] ib_uverbs_write+0xb7/0xc2 [ib_uverbs]
           [<ffffffff810fe47f>] vfs_write+0xa7/0xee
           [<ffffffff810fe65f>] sys_write+0x45/0x69
           [<ffffffff8138cdf9>] system_call_fastpath+0x16/0x1b

    -> #0 (SRQ-uobj){+++++.}:
           [<ffffffff81070898>] __lock_acquire+0xa29/0xd06
           [<ffffffff81070fd0>] lock_acquire+0xbf/0xfe
           [<ffffffff81384f28>] down_read+0x34/0x43
           [<ffffffffa00af51b>] idr_read_uobj+0x2f/0x4d [ib_uverbs]
           [<ffffffffa00af542>] idr_read_obj+0x9/0x19 [ib_uverbs]
           [<ffffffffa00b1728>] ib_uverbs_create_qp+0x1e5/0x684 [ib_uverbs]
           [<ffffffffa00ae3dd>] ib_uverbs_write+0xb7/0xc2 [ib_uverbs]
           [<ffffffff810fe47f>] vfs_write+0xa7/0xee
           [<ffffffff810fe65f>] sys_write+0x45/0x69
           [<ffffffff8138cdf9>] system_call_fastpath+0x16/0x1b

    other info that might help us debug this:

    Chain exists of:
      SRQ-uobj --> PD-uobj --> CQ-uobj

     Possible unsafe locking scenario:

           CPU0                    CPU1
           ----                    ----
      lock(CQ-uobj);
                                   lock(PD-uobj);
                                   lock(CQ-uobj);
      lock(SRQ-uobj);

     *** DEADLOCK ***

    3 locks held by ibv_srq_pingpon/2484:
     #0:  (QP-uobj){+.+...}, at: [<ffffffffa00b162c>] ib_uverbs_create_qp+0xe9/0x684 [ib_uverbs]
     #1:  (PD-uobj){++++++}, at: [<ffffffffa00af51b>] idr_read_uobj+0x2f/0x4d [ib_uverbs]
     #2:  (CQ-uobj){+++++.}, at: [<ffffffffa00af51b>] idr_read_uobj+0x2f/0x4d [ib_uverbs]

    stack backtrace:
    Pid: 2484, comm: ibv_srq_pingpon Not tainted 3.4.0-rc5+ #34
    Call Trace:
     [<ffffffff8137eff0>] print_circular_bug+0x1f8/0x209
     [<ffffffff81070898>] __lock_acquire+0xa29/0xd06
     [<ffffffffa00af37c>] ? __idr_get_uobj+0x20/0x5e [ib_uverbs]
     [<ffffffffa00af51b>] ? idr_read_uobj+0x2f/0x4d [ib_uverbs]
     [<ffffffff81070fd0>] lock_acquire+0xbf/0xfe
     [<ffffffffa00af51b>] ? idr_read_uobj+0x2f/0x4d [ib_uverbs]
     [<ffffffff81070eee>] ? lock_release+0x166/0x189
     [<ffffffff81384f28>] down_read+0x34/0x43
     [<ffffffffa00af51b>] ? idr_read_uobj+0x2f/0x4d [ib_uverbs]
     [<ffffffffa00af51b>] idr_read_uobj+0x2f/0x4d [ib_uverbs]
     [<ffffffffa00af542>] idr_read_obj+0x9/0x19 [ib_uverbs]
     [<ffffffffa00b1728>] ib_uverbs_create_qp+0x1e5/0x684 [ib_uverbs]
     [<ffffffff81070fec>] ? lock_acquire+0xdb/0xfe
     [<ffffffff81070c09>] ? lock_release_non_nested+0x94/0x213
     [<ffffffff810d470f>] ? might_fault+0x40/0x90
     [<ffffffff810d470f>] ? might_fault+0x40/0x90
     [<ffffffffa00ae3dd>] ib_uverbs_write+0xb7/0xc2 [ib_uverbs]
     [<ffffffff810fe47f>] vfs_write+0xa7/0xee
     [<ffffffff810ff736>] ? fget_light+0x3b/0x99
     [<ffffffff810fe65f>] sys_write+0x45/0x69
     [<ffffffff8138cdf9>] system_call_fastpath+0x16/0x1b

Reported-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-05-08 11:17:34 -07:00
Roland Dreier
3bea57a5fc IB/uverbs: Make lockdep output more readable
Add names for our lockdep classes, so instead of having to decipher
lockdep output with mysterious names:

    Chain exists of:
      key#14 --> key#11 --> key#13

lockdep will give us something nicer:

    Chain exists of:
      SRQ-uobj --> PD-uobj --> CQ-uobj

Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-05-08 11:17:34 -07:00
Mike Marciniszyn
464357a759 IB/ipath: Replace open-coded ARRAY_SIZE with macro
Change sizeof(array)/sizeof(array[0]) to ARRAY_SIZE.

Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-05-08 11:17:20 -07:00
Jim Cromie
6475f1df27 IB/ipath: Replace open-coded ARRAY_SIZE with macro
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Acked-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-05-08 11:17:20 -07:00
Steve Wise
bd61baaf59 RDMA/cxgb4: Use dst parameter in import_ep()
Function import_ep() is incorrectly using ep->dst instead of the dst
ptr passed in.  This causes a crash when accepting new rdma connections
becase ep->dst is not initialized yet.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-05-08 11:17:04 -07:00
Or Gerlitz
c3bccbfbb7 IB/core: Use qp->usecnt to track multicast attach/detach
Just as we don't allow PDs, CQs, etc. to be destroyed if there are QPs
that are attached to them, don't let a QP be destroyed if there are
multicast group(s) attached to it.  Use the existing usecnt field of
struct ib_qp which was added by commit 0e0ec7e ("RDMA/core: Export
ib_open_qp() to share XRC TGT QPs") to track this.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-05-08 11:16:54 -07:00
David S. Miller
0d6c4a2e46 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/intel/e1000e/param.c
	drivers/net/wireless/iwlwifi/iwl-agn-rx.c
	drivers/net/wireless/iwlwifi/iwl-trans-pcie-rx.c
	drivers/net/wireless/iwlwifi/iwl-trans.h

Resolved the iwlwifi conflict with mainline using 3-way diff posted
by John Linville and Stephen Rothwell.  In 'net' we added a bug
fix to make iwlwifi report a more accurate skb->truesize but this
conflicted with RX path changes that happened meanwhile in net-next.

In e1000e a conflict arose in the validation code for settings of
adapter->itr.  'net-next' had more sophisticated logic so that
logic was used.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-05-07 23:35:40 -04:00
Linus Torvalds
ebcf596d89 A few fixes for regressions introduced in 3.4-rc1:
- fix memory leak in mlx4
  - fix two problems with new MAD response generation code
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABCAAGBQJPmYfMAAoJEENa44ZhAt0h06IP/jPJeH855wC2RgWXY2Fd+b6w
 K/DQVE2MaX2f8EHuAB5MKoC0zzchKNu0GTMbrZMizbocDBmIU6ibF3MKx2rycSBC
 FI906X7FFAeSyaywrI+k+ZGxVQYAck1QFDr44WiyJ400r8BGbx9ZECKvIyEwjaUa
 RpRn4Pui4IjMV7GFqiziUsa9uo4xKYagjNfeInAZXelzIN76Coeae28dAIE2rPT/
 BTcrtQ2JMG6H4AVGTImTP+Kqk1Cm7apBeHS4+SGWWTdw1WH4IcMqPmxHBUUtKTWt
 UnlgpHquuC913VpkbkGeog32iAndhHgAkfs67Rs0IFwck127Il/DiPK/zir2zkZ2
 YVilzYIxrSUYqPZYfKFcrKsnjfM+HJxBEPCfDjxfG0TjrVS8tnc75Uq1NvIt6e42
 lcIclLDyZtdsMLmJlsTsT5cF7HlThSjpZDSgmMi2eeOI0RKoqRyyGGwiAIq7Ud2a
 oRa+bheh4pBD2lvzd62uEkgMkTaToAz3I/vlbOTSwxYg2skoHUpwNPCsmstsQ7RU
 BHi7Nl5DLNT5lNBTIouWvbilpXs8eXwgcNrQcWSYaPeY2aDsGAAdi9Q4BadENao9
 dmxOtqjryctX7U7LGDRmNDIfiyPY5AodDUJjFux1JwqbdSCGf1Lz7uvDWu7gqL9V
 TCJFAK54okexpoyV6Vgv
 =iRrT
 -----END PGP SIGNATURE-----

Merge tag 'ib-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

Pull infiniband fixes from Roland Dreier:
 "A few fixes for regressions introduced in 3.4-rc1:
   - fix memory leak in mlx4
   - fix two problems with new MAD response generation code"

* tag 'ib-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  IB/mlx4: Fix memory leaks in ib_link_query_port()
  IB/mad: Don't send response for failed MADs
  IB/mad: Set 'D' bit in response for unhandled MADs
2012-04-26 15:35:35 -07:00
Roland Dreier
b609379f8d Merge branches 'mad-response' and 'mlx4' into fixes 2012-04-24 16:11:46 -07:00
Jesper Juhl
bf6b47deb4 IB/mlx4: Fix memory leaks in ib_link_query_port()
If the call to mlx4_MAD_IFC() fails in ib_link_query_port() we will
currently do 'return err;' which will leak 'in_mad' and 'out_mad'.  We
should instead do 'goto out;' where we'll properly free the memory we
previously allocated.

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Acked-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-04-24 16:11:21 -07:00
Jack Morgenstein
a9e7432319 IB/mad: Don't send response for failed MADs
Commit 0b30704304 ("IB/mad: Return error response for unsupported
MADs") does not failed MADs (eg those that return
IB_MAD_RESULT_FAILURE) properly -- these MADs should be silently
discarded. (We should not force the lower-layer drivers to return
SUCCESS | CONSUMED in this case, since the MAD is NOT successful).
Unsupported MADs are not failures -- they return SUCCESS, but with an
"unsupported error" status value inside the response MAD.

Reviewed-by: Hal Rosenstock <hal@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-04-24 16:08:57 -07:00
Jack Morgenstein
840777de53 IB/mad: Set 'D' bit in response for unhandled MADs
Commit 0b30704304 ("IB/mad: Return error response for unsupported
MADs") does not handle directed-route MADs properly -- it fails to set
the 'D' bit in the response MAD status field.  This is a problem for
SmInfo MADs when the receiver does not have an SM running.

Reviewed-by: Hal Rosenstock <hal@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-04-24 16:06:50 -07:00
David S. Miller
f24001941c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Fix merge between commit 3adadc08cc ("net ax25: Reorder ax25_exit to
remove races") and commit 0ca7a4c87d ("net ax25: Simplify and
cleanup the ax25 sysctl handling")

The former moved around the sysctl register/unregister calls, the
later simply removed them.

With help from Stephen Rothwell.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-23 23:15:17 -04:00
Eric W. Biederman
ec8f23ce0f net: Convert all sysctl registrations to register_net_sysctl
This results in code with less boiler plate that is a bit easier
to read.

Additionally stops us from using compatibility code in the sysctl
core, hastening the day when the compatibility code can be removed.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-20 21:22:30 -04:00
Eric W. Biederman
5dd3df105b net: Move all of the network sysctls without a namespace into init_net.
This makes it clearer which sysctls are relative to your current network
namespace.

This makes it a little less error prone by not exposing sysctls for the
initial network namespace in other namespaces.

This is the same way we handle all of our other network interfaces to
userspace and I can't honestly remember why we didn't do this for
sysctls right from the start.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-20 21:21:17 -04:00
Andy Grover
a12f41f841 target: Rename target_allocate_tasks to target_setup_cmd_from_cdb
This patch renames a horribly misnamed function that no longer allocate
tasks to something more descriptive for it's modern use in target core.

(nab: Fix up ib_srpt to use this as well ahead of a target_submit_cmd
conversion)

Signed-off-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-04-14 17:40:36 -07:00
Roland Dreier
6f9e7f01b6 IB/srpt: Remove use of transport_do_task_sg_chain()
With the modern target core, se_cmd->t_data_sg already points to a
sglist that covers the whole command.  So task_sg chaining is needless
overhead and obfuscation -- instead of splicing the split up task
sglists back into one list, we can just use the original list directly.

Signed-off-by: Roland Dreier <roland@purestorage.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-04-14 17:40:32 -07:00
Linus Torvalds
4166fb6459 Add a fix for a bug hit by Alexey Shvetsov in ib_srtp that hits on
non-mlx4 hardware.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABCAAGBQJPh2i0AAoJEENa44ZhAt0hf3UQAJkb0Ol8earKatr3qWuDpWtf
 aQ5AvdRIWAnxXrIc0pDagxgQ7mF/9OFuuRsJCnoyxbhlLGuDcBEs0b9fm5arhKCz
 0fNTxcZSftpO7mNDjzcS71rsRga5GLZcKnJ/I1AWLpG8xJ5nPSfOzwTnC+U/kczm
 GnAhcNN1pR7jmUXxfiafY4VFWcNQqhLQTa+TZdlzdPXoSfxk6M8RyGzVOn99V2rR
 XSYo7xO0Zww5WKSgBmULwGJ3h1wqWXqA9GZrQnCVJ53BzNRt/oxrumEoKBohdnVR
 7Ce8Ts8oVRr99RmAqTLKTzy7A7SJJpsrHUmYWDl7aDM5LJyFvk75chr0b0u7IaIC
 SRBmdFTWbT+JX8lu4gGhwk9GBl6aMePnaUkOpbt2RbYVgVChUFHZD6mvmvA589JD
 PspnQWb+FarM6wT8rXT2RXAxGNCZFj1W8jz120HFAi8hcWxMVPMWWCU/dWMfJsMS
 42YfleWqmstPD9y5ir3Mj7kcrCTi1NXbIAoR4a255DNgD3ih79QzYT/W0Ln47D+P
 XAEjPR0L5D3f4pbAAB3ogB3Z3sejRGi8DLK0bO3PsuvjkXq+XoSK+gunnVQwl8u4
 q4vffLr6EMBUs6AZy+VNPAwzOwSojGfzWD0k7S/iFsJmY4zHTWtKjcVVW8X3Ivw4
 HMZDpHkXI+rRICOF24A1
 =fWhE
 -----END PGP SIGNATURE-----

Merge tag 'srpt-srq-type' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

Pull infiniband fix from Roland Dreier:
 "Add a fix for a bug hit by Alexey Shvetsov in ib_srtp that hits on
  non-mlx4 hardware."

* tag 'srpt-srq-type' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  IB/srpt: Set srq_type to IB_SRQT_BASIC
2012-04-12 18:51:32 -07:00
David S. Miller
011e3c6325 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2012-04-12 19:41:23 -04:00
Roland Dreier
6f3603367b IB/srpt: Set srq_type to IB_SRQT_BASIC
Since commit 96104eda01 ("RDMA/core: Add SRQ type field"), kernel
users of SRQs need to specify srq_type = IB_SRQT_BASIC in struct
ib_srq_init_attr, or else most low-level drivers will fail in
when srpt_add_one() calls ib_create_srq() and gets -ENOSYS.

(mlx4_ib works OK nearly all of the time, because it just needs
srq_type != IB_SRQT_XRC.  And apparently nearly everyone using
ib_srpt is using mlx4 hardware)

Reported-by: Alexey Shvetsov <alexxy@gentoo.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-04-12 07:56:57 -07:00
David S. Miller
06eb4eafbd Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2012-04-10 14:30:45 -04:00
Amir Vadai
366cddb402 IB/rdma_cm: TOS <=> UP mapping for IBoE
Both tagged traffic and untagged traffic use tc tool mapping.
Treat RDMA TOS same as IP TOS when mapping to SL

Signed-off-by: Amir Vadai <amirv@mellanox.com>
CC: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-05 05:08:04 -04:00
Roland Dreier
0559d8dc13 IB/core: Don't return EINVAL from sysfs rate attribute for invalid speeds
Commit e9319b0cb0 ("IB/core: Fix SDR rates in sysfs") changed our
sysfs rate attribute to return EINVAL to userspace if the underlying
device driver returns an invalid rate.  Apparently some drivers do this
when the link is down and some userspace pukes if it gets an error when
reading this attribute, so avoid a regression by not return an error to
match the old code.

Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-04-02 10:57:31 -07:00
Or Gerlitz
d2ef406866 IB/mlx4: Don't return an invalid speed when a port is down
When the IB port is down, the active_speed value returned by the
MAD_IFC command is seven (7) which isn't among the defined IB speeds
in enum ib_port_speed, and this invalid speed value is passed up to
higher layers or applications who do port query.

Fix that by setting the speed to be SDR -- the lowest possible -- when
the port is down.

Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-04-02 10:55:24 -07:00
David S. Miller
4e24ffa4d9 infiniband: Stop using NLA_PUT*().
These macros contain a hidden goto, and are thus extremely error
prone and make code hard to audit.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-02 04:33:42 -04:00
David Howells
9ffc93f203 Remove all #inclusions of asm/system.h
Remove all #inclusions of asm/system.h preparatory to splitting and killing
it.  Performed with the following command:

perl -p -i -e 's!^#\s*include\s*<asm/system[.]h>.*\n!!' `grep -Irl '^#\s*include\s*<asm/system[.]h>' *`

Signed-off-by: David Howells <dhowells@redhat.com>
2012-03-28 18:30:03 +01:00
Linus Torvalds
1ab142d499 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull SCSI target updates from Nicholas Bellinger:
 "This contains the usual set of updates and bugfixes to target-core +
  existing fabric module code, along with a handful of the patches
  destined for v3.3 stable.

  It also contains the necessary target-core infrastructure pieces
  required to run using tcm_qla2xxx.ko WWPNs with the new Qlogic Fibre
  Channel fabric module currently queued in target-pending/for-next-merge,
  and coming for round 2.

  The highlights for this series include:

   - Add target_submit_tmr() helper function for fabric task management
     (andy)
   - Convert tcm_fc to use target_submit_tmr() (andy)
   - Replace target core various cmd flags with a transport state (hch)
   - Convert loopback to use workqueue submission (hch)
   - Convert target core to use array_zalloc for tpg_lun_list (joern)
   - Convert target core to use array_zalloc for device_list (joern)
   - Add target core support for TMR_ABORT_TASK (nab)
   - Add target core se_sess->sess_kref + get/put helpers (nab)
   - Add target core se_node_acl->acl_kref for ->acl_free_comp usage
     (nab)
   - Convert iscsi-target to use target_put_session + sess_kref (nab)
   - Fix tcm_fc fc_exch memory leak in ft_send_resp_status (nab)
   - Fix ib_srpt srpt_handle_cmd send_ioctx->ioctx_kref leak on
     exception (nab)
   - Fix target core up handling of short INQUIRY buffers (roland)
   - Untangle target-core front-end and back-end meanings of max_sectors
     attribute (roland)
   - Set loopback residual field for SCSI commands (roland)
   - Fix target-core 16-bit target ports for SET TARGET PORT GROUPS
     emulation (roland)

  Thanks again to Andy, Christoph, Joern, Roland, and everyone who has
  contributed this round!"

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (64 commits)
  ib_srpt: Fix srpt_handle_cmd send_ioctx->ioctx_kref leak on exception
  loopback: Fix transport_generic_allocate_tasks error handling
  iscsi-target: remove improper externs
  iscsi-target: Remove unused variables in iscsi_target_parameters.c
  target: remove obvious warnings
  target: Use array_zalloc for device_list
  target: Use array_zalloc for tpg_lun_list
  target: Fix sense code for unsupported SERVICE ACTION IN
  target: Remove hack to make READ CAPACITY(10) lie if thin provisioning is enabled
  target: Bump core version to v4.1.0-rc2-ml + fabric versions
  tcm_fc: Fix fc_exch memory leak in ft_send_resp_status
  target: Drop unused legacy target_core_fabric_ops API callers
  iscsi-target: Convert to use target_put_session + sess_kref
  target: Convert se_node_acl->acl_group removal to use ->acl_kref
  target: Add se_node_acl->acl_kref for ->acl_free_comp usage
  target: Add se_node_acl->acl_free_comp for NodeACL release path
  target: Add se_sess->sess_kref + get/put helpers
  target: Convert session_lock to irqsave
  target: Fix typo in drivers/target
  iscsi-target: Fix dynamic -> explict NodeACL pointer reference
  ...
2012-03-22 12:38:04 -07:00
Linus Torvalds
0c2fe82a9b InfiniBand/RDMA changes for the 3.4 merge window. Nothing big really
stands out; by patch count lots of fixes to the mlx4 driver plus some
 cleanups and fixes to the core and other drivers.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABCAAGBQJPZ2THAAoJEENa44ZhAt0hMgkP/AwXjwzz6lYlIizytQ5KOH11
 CT/VoHLIGIwY31aRhZRHggWLEbJi8akgfh04K9PiSh/muFRPYk5KJDoQEoJV5Odv
 12krqtpZeL41YcAPq0wiyP3Ki5AZDCDumzWbOtmxE9m93mVfi3yFTTWDHFo/7SOx
 A2kUITdKuZhgBpCFYS23Gs8QTWcMPUlwNlM76edG90BjSySHsK15tbqTUaEOC6Ux
 SPG0p0c2YjlZCPmRObrCL65o5LFRVajinZ4rXN7rre4LEU+IuXgW+mQU2Kwy8z6a
 oMPE2l4OMSqj270r2PlVK087gGH69nKfLgLQLJiP0Id+KwdYUk7vQcA+Fu0jwjP3
 Ci/ahllvB76IiaNbax0vTy2ohCJmilLLotAP46NW0vPR2/KnmVy0Mqk8pXQQddw4
 tnAFNnafn/MjTty8GwqyXR/enJZs29ePcSxcYnKjXYRIgG4ldejL2wktgSra8+gb
 3/qFag1HRgZzcICkXGpHj7uWJ2KDe++1m6KTb0THUmrjuiel6ATFZpYoeRed3GfS
 ZMqpjZvuXM8nA9CrKMkzm5vIiUFF8Qq0hUTLR38F8FiNEhmC08JqH5PqjJ3Z18if
 R1yG4W4xmp/0ICHg4zyg6LWV3YsweDD+jSCJpW+KNBvu3HtYb41HIlK+i2TTmtPD
 3etEZZRwROAyYKcwN01p
 =DbLx
 -----END PGP SIGNATURE-----

Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

Pull InfiniBand/RDMA changes for the 3.4 merge window from Roland Dreier:
 "Nothing big really stands out; by patch count lots of fixes to the
  mlx4 driver plus some cleanups and fixes to the core and other
  drivers."

* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (28 commits)
  mlx4_core: Scale size of MTT table with system RAM
  mlx4_core: Allow dynamic MTU configuration for IB ports
  IB/mlx4: Fix info returned when querying IBoE ports
  IB/mlx4: Fix possible missed completion event
  mlx4_core: Report thermal error events
  mlx4_core: Fix one more static exported function
  IB: Change CQE "csum_ok" field to a bit flag
  RDMA/iwcm: Reject connect requests if cmid is not in LISTEN state
  RDMA/cxgb3: Don't pass irq flags to flush_qp()
  mlx4_core: Get rid of redundant ext_port_cap flags
  RDMA/ucma: Fix AB-BA deadlock
  IB/ehca: Fix ilog2() compile failure
  IB: Use central enum for speed instead of hard-coded values
  IB/iser: Post initial receive buffers before sending the final login request
  IB/iser: Free IB connection resources in the proper place
  IB/srp: Consolidate repetitive sysfs code
  IB/srp: Use pr_fmt() and pr_err()/pr_warn()
  IB/core: Fix SDR rates in sysfs
  mlx4: Enforce device max FMR maps in FMR alloc
  IB/mlx4: Set bad_wr for invalid send opcode
  ...
2012-03-21 10:33:42 -07:00
Linus Torvalds
9f3938346a Merge branch 'kmap_atomic' of git://github.com/congwang/linux
Pull kmap_atomic cleanup from Cong Wang.

It's been in -next for a long time, and it gets rid of the (no longer
used) second argument to k[un]map_atomic().

Fix up a few trivial conflicts in various drivers, and do an "evil
merge" to catch some new uses that have come in since Cong's tree.

* 'kmap_atomic' of git://github.com/congwang/linux: (59 commits)
  feature-removal-schedule.txt: schedule the deprecated form of kmap_atomic() for removal
  highmem: kill all __kmap_atomic() [swarren@nvidia.com: highmem: Fix ARM build break due to __kmap_atomic rename]
  drbd: remove the second argument of k[un]map_atomic()
  zcache: remove the second argument of k[un]map_atomic()
  gma500: remove the second argument of k[un]map_atomic()
  dm: remove the second argument of k[un]map_atomic()
  tomoyo: remove the second argument of k[un]map_atomic()
  sunrpc: remove the second argument of k[un]map_atomic()
  rds: remove the second argument of k[un]map_atomic()
  net: remove the second argument of k[un]map_atomic()
  mm: remove the second argument of k[un]map_atomic()
  lib: remove the second argument of k[un]map_atomic()
  power: remove the second argument of k[un]map_atomic()
  kdb: remove the second argument of k[un]map_atomic()
  udf: remove the second argument of k[un]map_atomic()
  ubifs: remove the second argument of k[un]map_atomic()
  squashfs: remove the second argument of k[un]map_atomic()
  reiserfs: remove the second argument of k[un]map_atomic()
  ocfs2: remove the second argument of k[un]map_atomic()
  ntfs: remove the second argument of k[un]map_atomic()
  ...
2012-03-21 09:40:26 -07:00
Linus Torvalds
69a7aebcf0 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial tree from Jiri Kosina:
 "It's indeed trivial -- mostly documentation updates and a bunch of
  typo fixes from Masanari.

  There are also several linux/version.h include removals from Jesper."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (101 commits)
  kcore: fix spelling in read_kcore() comment
  constify struct pci_dev * in obvious cases
  Revert "char: Fix typo in viotape.c"
  init: fix wording error in mm_init comment
  usb: gadget: Kconfig: fix typo for 'different'
  Revert "power, max8998: Include linux/module.h just once in drivers/power/max8998_charger.c"
  writeback: fix fn name in writeback_inodes_sb_nr_if_idle() comment header
  writeback: fix typo in the writeback_control comment
  Documentation: Fix multiple typo in Documentation
  tpm_tis: fix tis_lock with respect to RCU
  Revert "media: Fix typo in mixer_drv.c and hdmi_drv.c"
  Doc: Update numastat.txt
  qla4xxx: Add missing spaces to error messages
  compiler.h: Fix typo
  security: struct security_operations kerneldoc fix
  Documentation: broken URL in libata.tmpl
  Documentation: broken URL in filesystems.tmpl
  mtd: simplify return logic in do_map_probe()
  mm: fix comment typo of truncate_inode_pages_range
  power: bq27x00: Fix typos in comment
  ...
2012-03-20 21:12:50 -07:00
Cong Wang
2a156d094d infiniband: remove the second argument of k[un]map_atomic()
Acked-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Cong Wang <amwang@redhat.com>
2012-03-20 21:48:18 +08:00
Roland Dreier
f0e88aeb19 Merge branches 'cma', 'cxgb3', 'cxgb4', 'ehca', 'iser', 'mad', 'nes', 'qib', 'srp' and 'srpt' into for-next 2012-03-19 09:50:33 -07:00
Nicholas Bellinger
187e70a554 ib_srpt: Fix srpt_handle_cmd send_ioctx->ioctx_kref leak on exception
This patch addresses a bug in srpt_handle_cmd() failure handling where
send_ioctx->kref is being leaked with the local extra reference after init,
causing the expected kref_put() in srpt_handle_send_comp() to not be the final
call to invoke srpt_put_send_ioctx_kref() -> transport_generic_free_cmd() and
perform se_cmd descriptor memory release.

It also fixes a SCF_SCSI_RESERVATION_CONFLICT handling bug where this code
is incorrectly falling through to transport_handle_cdb_direct() after
invoking srpt_queue_status() to send SAM_STAT_RESERVATION_CONFLICT status.

Note this patch is for >= v3.3 mainline code, and current lio-core.git
code has already been converted to target_submit_cmd() + se_cmd->cmd_kref usage,
and internal ioctx->kref usage has been removed.  I'm including this patch
now into target-pending/for-next with a CC' for v3.3 stable.

Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Roland Dreier <roland@purestorage.com>
Cc: stable@vger.kernel.org
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-03-17 22:13:48 -07:00
Roland Dreier
42872c7a5e Merge branches 'misc' and 'mlx4' into for-next
Conflicts:
	drivers/infiniband/hw/mlx4/main.c
	drivers/net/ethernet/mellanox/mlx4/main.c
	include/linux/mlx4/device.h
2012-03-12 16:25:28 -07:00
Or Gerlitz
a9c766bb75 IB/mlx4: Fix info returned when querying IBoE ports
To issue a port query, use the QUERY_(Ethernet)_PORT command instead
of the MAD_IFC command, since MAD_IFC attempts to query the firmware
IB SMA, which is irrelevant for IBoE ports.

This allows us to handle both 10Gb/s and 40Gb/s rates (e.g in sysfs),
using QDR speed (10Gb/s) and width of 1X or 4X.

Signed-off-by: Dotan Barak <dotanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-03-12 16:24:59 -07:00
Eli Cohen
3616f9cead IB/mlx4: Fix possible missed completion event
If an erroneous CQE is polled in the first iteration (i.e. npolled ==
0), we don't update the consumer index and hence the hardware could
get a wrong notion of how many CQEs software polled.  Fix this by
unconditionally updating the doorbell record.  We could change the
check to be something like

	if (npolled || err != -EAGAIN)
		...

but it does not seem worth the effort since a posted write to memory
should not cost too much.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-03-12 16:24:59 -07:00
Nicholas Bellinger
c7ec05c82b target: Drop unused legacy target_core_fabric_ops API callers
This patch drops the following unused legacy API callers from target_core_fabric.h:

*) TFO->fall_back_to_erl0()
*) TFO->stop_session()
*) TFO->sess_logged_in()
*) TFO->is_state_remove()

This patch also removes the stub usage in loopback, tcm_fc, iscsi_target,
and ib_srpt fabric modules.

Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-03-10 14:42:55 -08:00
Or Gerlitz
d927d505c5 IB: Change CQE "csum_ok" field to a bit flag
Use a bit in wc_flags rather then a whole integer to hold the
"checksum OK" flag.  By itself, this change doesn't reduce the size of
struct ib_wc on 64bit machines -- it stays on 56 bytes because of
padding.  However, it will allow to add more fields in the future
without enlarging the struct.  Also, it will let us have a unified
approach with future libibverbs checksum offload reporting, because a
bit flag doesn't break the library ABI.

This patch was suggested during conversation with Liran Liss
<liranl@mellanox.com>.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-03-08 12:34:27 -08:00
Steve Wise
3eae7c9f97 RDMA/iwcm: Reject connect requests if cmid is not in LISTEN state
When destroying a listening cmid, the iwcm first marks the state of
the cmid as DESTROYING, then releases the lock and calls into the
iWARP provider to destroy the endpoint.  Since the cmid is not locked,
its possible for the iWARP provider to pass a connection request event
to the iwcm, which will be silently dropped by the iwcm.  This causes
the iWARP provider to never free up the resources from this connection
because the assumption is the iwcm will accept or reject this connection.

The solution is to reject these connection requests.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-03-07 15:14:53 -08:00
Steve Wise
db4106ce63 RDMA/cxgb3: Don't pass irq flags to flush_qp()
Since flush_qp() is always called with irqs disabled, all the locking
inside flush_qp() and __flush_qp() doesn't need irq save/restore.

Further, passing the flag variable from iwch_modify_qp() is just wrong
and causes a WARN_ON() in local_bh_enable().

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-03-07 15:12:45 -08:00
Or Gerlitz
8154c07fe1 mlx4_core: Get rid of redundant ext_port_cap flags
While doing the work for commit a6f7feae6d ("IB/mlx4: pass SMP
vendor-specific attribute MADs to firmware") we realized that the
firmware would respond on all sorts of vendor-specific MADs.
Therefore commit 97285b7817 ("mlx4_core: Add extended port
capabilities support") adds redundant code into the driver, since
there's no real reaon to maintain the extended capabilities of the
port, as they can be queried on demand (e.g the FDR10 capability).

This patch reverts commit 97285b7817 and removes the check for
extended caps from the mlx4_ib driver port query flow.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-03-06 17:25:18 -08:00
Hefty, Sean
186834b5de RDMA/ucma: Fix AB-BA deadlock
When we destroy a cm_id, we must purge associated events from the
event queue.  If the cm_id is for a listen request, we also purge
corresponding pending connect requests.  This requires destroying
the cm_id's associated with the connect requests by calling
rdma_destroy_id().  rdma_destroy_id() blocks until all outstanding
callbacks have completed.

The issue is that we hold file->mut while purging events from the
event queue.  We also acquire file->mut in our event handler.  Calling
rdma_destroy_id() while holding file->mut can lead to a deadlock,
since the event handler callback cannot acquire file->mut, which
prevents rdma_destroy_id() from completing.

Fix this by moving events to purge from the event queue to a temporary
list.  We can then release file->mut and call rdma_destroy_id()
outside of holding any locks.

Bug report by Or Gerlitz <ogerlitz@mellanox.com>:

    [ INFO: possible circular locking dependency detected ]
    3.3.0-rc5-00008-g79f1e43-dirty #34 Tainted: G          I

    tgtd/9018 is trying to acquire lock:
     (&id_priv->handler_mutex){+.+.+.}, at: [<ffffffffa0359a41>] rdma_destroy_id+0x33/0x1f0 [rdma_cm]

    but task is already holding lock:
     (&file->mut){+.+.+.}, at: [<ffffffffa02470fe>] ucma_free_ctx+0xb6/0x196 [rdma_ucm]

    which lock already depends on the new lock.


    the existing dependency chain (in reverse order) is:

    -> #1 (&file->mut){+.+.+.}:
           [<ffffffff810682f3>] lock_acquire+0xf0/0x116
           [<ffffffff8135f179>] mutex_lock_nested+0x64/0x2e6
           [<ffffffffa0247636>] ucma_event_handler+0x148/0x1dc [rdma_ucm]
           [<ffffffffa035a79a>] cma_ib_handler+0x1a7/0x1f7 [rdma_cm]
           [<ffffffffa0333e88>] cm_process_work+0x32/0x119 [ib_cm]
           [<ffffffffa03362ab>] cm_work_handler+0xfb8/0xfe5 [ib_cm]
           [<ffffffff810423e2>] process_one_work+0x2bd/0x4a6
           [<ffffffff810429e2>] worker_thread+0x1d6/0x350
           [<ffffffff810462a6>] kthread+0x84/0x8c
           [<ffffffff81369624>] kernel_thread_helper+0x4/0x10

    -> #0 (&id_priv->handler_mutex){+.+.+.}:
           [<ffffffff81067b86>] __lock_acquire+0x10d5/0x1752
           [<ffffffff810682f3>] lock_acquire+0xf0/0x116
           [<ffffffff8135f179>] mutex_lock_nested+0x64/0x2e6
           [<ffffffffa0359a41>] rdma_destroy_id+0x33/0x1f0 [rdma_cm]
           [<ffffffffa024715f>] ucma_free_ctx+0x117/0x196 [rdma_ucm]
           [<ffffffffa0247255>] ucma_close+0x77/0xb4 [rdma_ucm]
           [<ffffffff810df6ef>] fput+0x117/0x1cf
           [<ffffffff810dc76e>] filp_close+0x6d/0x78
           [<ffffffff8102b667>] put_files_struct+0xbd/0x17d
           [<ffffffff8102b76d>] exit_files+0x46/0x4e
           [<ffffffff8102d057>] do_exit+0x299/0x75d
           [<ffffffff8102d599>] do_group_exit+0x7e/0xa9
           [<ffffffff8103ae4b>] get_signal_to_deliver+0x536/0x555
           [<ffffffff81001717>] do_signal+0x39/0x634
           [<ffffffff81001d39>] do_notify_resume+0x27/0x69
           [<ffffffff81361c03>] retint_signal+0x46/0x83

    other info that might help us debug this:

     Possible unsafe locking scenario:

           CPU0                    CPU1
           ----                    ----
      lock(&file->mut);
                                   lock(&id_priv->handler_mutex);
                                   lock(&file->mut);
      lock(&id_priv->handler_mutex);

     *** DEADLOCK ***

    1 lock held by tgtd/9018:
     #0:  (&file->mut){+.+.+.}, at: [<ffffffffa02470fe>] ucma_free_ctx+0xb6/0x196 [rdma_ucm]

    stack backtrace:
    Pid: 9018, comm: tgtd Tainted: G          I  3.3.0-rc5-00008-g79f1e43-dirty #34
    Call Trace:
     [<ffffffff81029e9c>] ? console_unlock+0x18e/0x207
     [<ffffffff81066433>] print_circular_bug+0x28e/0x29f
     [<ffffffff81067b86>] __lock_acquire+0x10d5/0x1752
     [<ffffffff810682f3>] lock_acquire+0xf0/0x116
     [<ffffffffa0359a41>] ? rdma_destroy_id+0x33/0x1f0 [rdma_cm]
     [<ffffffff8135f179>] mutex_lock_nested+0x64/0x2e6
     [<ffffffffa0359a41>] ? rdma_destroy_id+0x33/0x1f0 [rdma_cm]
     [<ffffffff8106546d>] ? trace_hardirqs_on_caller+0x11e/0x155
     [<ffffffff810654b1>] ? trace_hardirqs_on+0xd/0xf
     [<ffffffffa0359a41>] rdma_destroy_id+0x33/0x1f0 [rdma_cm]
     [<ffffffffa024715f>] ucma_free_ctx+0x117/0x196 [rdma_ucm]
     [<ffffffffa0247255>] ucma_close+0x77/0xb4 [rdma_ucm]
     [<ffffffff810df6ef>] fput+0x117/0x1cf
     [<ffffffff810dc76e>] filp_close+0x6d/0x78
     [<ffffffff8102b667>] put_files_struct+0xbd/0x17d
     [<ffffffff8102b5cc>] ? put_files_struct+0x22/0x17d
     [<ffffffff8102b76d>] exit_files+0x46/0x4e
     [<ffffffff8102d057>] do_exit+0x299/0x75d
     [<ffffffff8102d599>] do_group_exit+0x7e/0xa9
     [<ffffffff8103ae4b>] get_signal_to_deliver+0x536/0x555
     [<ffffffff810654b1>] ? trace_hardirqs_on+0xd/0xf
     [<ffffffff81001717>] do_signal+0x39/0x634
     [<ffffffff8135e037>] ? printk+0x3c/0x45
     [<ffffffff8106546d>] ? trace_hardirqs_on_caller+0x11e/0x155
     [<ffffffff810654b1>] ? trace_hardirqs_on+0xd/0xf
     [<ffffffff81361803>] ? _raw_spin_unlock_irq+0x2b/0x40
     [<ffffffff81039011>] ? set_current_blocked+0x44/0x49
     [<ffffffff81361bce>] ? retint_signal+0x11/0x83
     [<ffffffff81001d39>] do_notify_resume+0x27/0x69
     [<ffffffff8118a1fe>] ? trace_hardirqs_on_thunk+0x3a/0x3f
     [<ffffffff81361c03>] retint_signal+0x46/0x83

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-03-05 12:27:57 -08:00
Kyle McMartin
bd50f8924c IB/ehca: Fix ilog2() compile failure
I'm getting compile failures building this driver, which I narrowed
down to the ilog2 call in ehca_get_max_hwpage_size...

    ERROR: ".____ilog2_NaN" [drivers/infiniband/hw/ehca/ib_ehca.ko]
    undefined!
    make[1]: *** [__modpost] Error 1
    make: *** [modules] Error 2

The use of shca->hca_cap_mr_pgsize is confusing the compiler, and
resulting in the __builtin_constant_p in ilog2 going insane.

I tried making it take the u32 pgsize as an argument and the expansion
of shca->_pgsize in the caller, but that failed as well.

With this patch in place, the driver compiles on my GCC 4.6.2 here.

Suggested-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Kyle McMartin <kmcmarti@redhat.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-03-05 10:12:35 -08:00
Or Gerlitz
2e96691c31 IB: Use central enum for speed instead of hard-coded values
The kernel IB stack uses one enumeration for IB speed, which wasn't
explicitly specified in the verbs header file.  Add that enum, and use
it all over the code.

The IB speed/width notation is also used by iWARP and IBoE HW drivers,
which use the convention of rate = speed * width to advertise their
port link rate.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-03-05 09:25:16 -08:00
Or Gerlitz
89e984e2c2 IB/iser: Post initial receive buffers before sending the final login request
An iser target may send iscsi NO-OP PDUs as soon as it marks the iSER
iSCSI session as fully operative.  This means that there is window
where there are no posted receive buffers on the initiator side, so
it's possible for the iSER RC connection to break because of RNR NAK /
retry errors.  To fix this, rely on the flags bits in the login
request to have FFP (0x3) in the lower nibble as a marker for the
final login request, and post an initial chunk of receive buffers
before sending that login request instead of after getting the login
response.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-03-05 08:53:05 -08:00
Doug Ledford
d474186f19 IB/iser: Free IB connection resources in the proper place
We allocate the login dma buffers in iser_verbs.c as part of
alloc_ib_conn_resources(), however we are freeing them in
iser_initiator.c as part of iser_free_rx_descriptors().  This is
needlessly confusing.  We have an alloc_rx_descriptors() and it
doesn't alloc something that the free_rx_descriptors() frees, and we
have an alloc_ib_conn_resources() that allocs something not freed by
free_ib_conn_resources().  Clean that up.

Signed-off-by: Doug Ledford <dledford@redhat.com>

[ Fix build error in iser_free_ib_conn_res().  - Or ]

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-03-05 00:23:27 -08:00
Bart Van Assche
683b159a2e IB/srp: Consolidate repetitive sysfs code
Remove sysfs attributes before removing a target instead of testing
the target state in every sysfs attribute callback method. Note: it is
safe to invoke a sysfs attribute removal method like
device_remove_file() twice on the same attribute.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: David Dillow <dillowda@ornl.gov>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-02-27 09:27:57 -08:00
Bart Van Assche
e0bda7d8c3 IB/srp: Use pr_fmt() and pr_err()/pr_warn()
Use pr_fmt() and pr_xxx() instead of more verbose printk() equivalents.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: David Dillow <dillowda@ornl.gov>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-02-27 09:26:30 -08:00
Roland Dreier
e9319b0cb0 IB/core: Fix SDR rates in sysfs
Commit 71eeba16 ("IB: Add new InfiniBand link speeds") introduced a bug 
where eg the rate for IB 4X SDR links iss displayed as "8.5 Gb/sec" 
instead of "10 Gb/sec" as it used to be.  Fix that.

Reported-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-02-27 09:15:08 -08:00
Pablo Neira Ayuso
80d326fab5 netlink: add netlink_dump_control structure for netlink_dump_start()
Davem considers that the argument list of this interface is getting
out of control. This patch tries to address this issue following
his proposal:

struct netlink_dump_control c = { .dump = dump, .done = done, ... };

netlink_dump_start(..., &c);

Suggested by David S. Miller.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-26 14:10:06 -05:00
Eli Cohen
a5bbe892da mlx4: Enforce device max FMR maps in FMR alloc
ConnectX devices have a limit on the number of mappings that can be
done on an FMR before having to call sync_tpt.  The current
mlx4_ib driver reports the limit correctly in max_map_per_fmr in
.query_device(), but mlx4_core doesn't check it when actually
allocating FMRs.

Add a max_fmr_maps field to struct mlx4_caps and enforce this maximum
value on FMR allocations.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-02-26 01:43:37 -08:00
Eli Cohen
4ba6b8eaa9 IB/mlx4: Set bad_wr for invalid send opcode
If the opcode of a work request exceeds the range of valid opcodes,
return the pointer to the offending work request.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-02-26 01:37:30 -08:00
Masanari Iida
a776ce7cfc IB/srpt: Fix typo "alocate" -> "allocate"
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-02-25 17:49:37 -08:00
Swapna Thete
0b30704304 IB/mad: Return error response for unsupported MADs
Set up a response with appropriate error status and send it for MADs
that are not supported by a specific class/version.

Reviewed-by: Hal Rosenstock <hal@mellanox.com>
Signed-off-by: Swapna Thete <swapna.thete@qlogic.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-02-25 17:47:32 -08:00
Eric Dumazet
6aeaa48b0d IB/ehca: Use kthread_create_on_node()
Since create_comp_task() creates percpu kthread, it makes sense to use
kthread_create_on_node() to get proper NUMA affinity for kthread
stack.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-02-25 17:47:21 -08:00
Mike Marciniszyn
520b3ee705 IB/qib: Avoid filtering LID on SMA portinfo
The current get portinfo handling filters the LID being sent,
changing zero to 0xffff.

This causes OpenSM to log excessive warning messages.

Reviewed-by: Edward Mascarenhas <edward.mascarenhas@qlogic.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-02-25 17:45:50 -08:00
Mike Marciniszyn
a778f3fddc IB/qib: Add logic for affinity hint
Call irq_set_affinity_hint() to give userspace programs such as
irqbalance the information to be able to distribute qib interrupts
appropriately.

The logic allocates all non-receive interrupts to the first CPU local
to the HCA.  Receive interrupts are allocated round robin starting
with the second CPU local to the HCA with potential wrap back to the
second CPU.

This patch also adds a refinement to the name registered for MSI-X
interrupts so that user level scripts can determine the device
associated with the IRQs when there are multiple HCAs with a
potentially different set of local CPUs.

Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-02-25 17:45:49 -08:00
Tatyana Nikolova
8dd87fba93 RDMA/nes: Fixes for sparse endianness warnings
Fix endianness problems detect by sparse, introduced with the enhanced
MPA patch.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com>
Signed-off-by: Faisal Latif <Faisal.Latif@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-02-25 17:45:37 -08:00
Kumar Sanghvi
91018f8632 RDMA/cxgb4: Add missing peer2peer check in MPAv2 code
Don't worry about p2p_type if peer2peer itself is not requested in the
first place.

Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-02-25 17:45:02 -08:00
Andy Grover
c8e31f26fe target: Add SCF_SCSI_TMR_CDB usage and drop se_tmr_req_cache
Change the test for if a cmd is a tmr request to checking if
SCF_SCSI_TMR_CDB (a new flag) is set in cmd->se_cmd_flags.

Also remove se_tmr_req_cache usage in favor of kzalloc usage,
and make core_tmr_alloc_req() return int + setup se_cmd->se_tmr_req
directly and fix up various fabric module usages

Cc: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-02-25 14:37:47 -08:00
Christoph Hellwig
7d680f3b74 target: replace various cmd flags with a transport state
Replace various atomic_ts used as flags in struct se_cmd with a single
transport_state bitmap that requires t_state_lock to be held for modifications.

In the target core that assumption generally is true, but some recently added
code in the SRP target had to grow new lock calls.  I can't say I like the way
how it messes with the command state directly, but let's leave that for later.

(Re-add missing ib_srpt.c changes that nab dropped..)

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-02-25 14:37:45 -08:00
David S. Miller
d5ef8a4d87 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/infiniband/hw/nes/nes_cm.c

Simple whitespace conflict.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-10 23:32:28 -05:00
Linus Torvalds
8df54d622a Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Quoth David:

1) GRO MAC header comparisons were ethernet specific, breaking other
   link types.  This required a multi-faceted fix to cure the originally
   noted case (Infiniband), because IPoIB was lying about it's actual
   hard header length.  Thanks to Eric Dumazet, Roland Dreier, and
   others.

2) Fix build failure when INET_UDP_DIAG is built in and ipv6 is modular.
   From Anisse Astier.

3) Off by ones and other bug fixes in netprio_cgroup from Neil Horman.

4) ipv4 TCP reset generation needs to respect any network interface
   binding from the socket, otherwise route lookups might give a
   different result than all the other segments received.  From Shawn
   Lu.

5) Fix unintended regression in ipv4 proxy ARP responses, from Thomas
   Graf.

6) Fix SKB under-allocation bug in sh_eth, from Yoshihiro Shimoda.

7) Revert skge PCI mapping changes that are causing crashes for some
   folks, from Stephen Hemminger.

8) IPV4 route lookups fill in the wildcarded fields of the given flow
   lookup key passed in, which is fine most of the time as this is
   exactly what the caller's want.  However there are a few cases that
   want to retain the original flow key values afterwards, so handle
   those cases properly.  Fix from Julian Anastasov.

9) IGB/IXGBE VF lookup bug fixes from Greg Rose.

10) Properly null terminate filename passed to ethtool flash device
    method, from Ben Hutchings.

11) S3 resume fix in via-velocity from David Lv.

12) Fix double SKB free during xmit failure in CAIF, from Dmitry
    Tarnyagin.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (72 commits)
  net: Don't proxy arp respond if iif == rt->dst.dev if private VLAN is disabled
  ipv4: Fix wrong order of ip_rt_get_source() and update iph->daddr.
  netprio_cgroup: fix wrong memory access when NETPRIO_CGROUP=m
  netprio_cgroup: don't allocate prio table when a device is registered
  netprio_cgroup: fix an off-by-one bug
  bna: fix error handling of bnad_get_flash_partition_by_offset()
  isdn: type bug in isdn_net_header()
  net: Make qdisc_skb_cb upper size bound explicit.
  ixgbe: ethtool: stats user buffer overrun
  ixgbe: dcb: up2tc mapping lost on disable/enable CEE DCB state
  ixgbe: do not update real num queues when netdev is going away
  ixgbe: Fix broken dependency on MAX_SKB_FRAGS being related to page size
  ixgbe: Fix case of Tx Hang in PF with 32 VFs
  ixgbe: fix vf lookup
  igb: fix vf lookup
  e1000: add dropped DMA receive enable back in for WoL
  gro: more generic L2 header check
  IPoIB: Stop lying about hard_header_len and use skb->cb to stash LL addresses
  zd1211rw: firmware needs duration_id set to zero for non-pspoll frames
  net: enable TC35815 for MIPS again
  ...
2012-02-10 14:18:46 -08:00
Masanari Iida
7367d99b30 SRP: Fix typo in ib_srpt.c
Correct spelling "alocate" to "allocate" in
drivers/infiniband/ulp/srpt/ib_srpt.c

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2012-02-09 23:09:54 +01:00
Roland Dreier
936d7de3d7 IPoIB: Stop lying about hard_header_len and use skb->cb to stash LL addresses
Commit a0417fa3a1 ("net: Make qdisc_skb_cb upper size bound
explicit.") made it possible for a netdev driver to use skb->cb
between its header_ops.create method and its .ndo_start_xmit
method.  Use this in ipoib_hard_header() to stash away the LL address
(GID + QPN), instead of the "ipoib_pseudoheader" hack.  This allows
IPoIB to stop lying about its hard_header_len, which will let us fix
the L2 check for GRO.

Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-08 18:26:54 -05:00
Roland Dreier
377cb4f9e7 IPoIB: Stop lying about hard_header_len and use skb->cb to stash LL addresses
Commit a0417fa3a1 ("net: Make qdisc_skb_cb upper size bound
explicit.") made it possible for a netdev driver to use skb->cb
between its header_ops.create method and its .ndo_start_xmit
method.  Use this in ipoib_hard_header() to stash away the LL address
(GID + QPN), instead of the "ipoib_pseudoheader" hack.  This allows
IPoIB to stop lying about its hard_header_len, which will let us fix
the L2 check for GRO.

Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-08 15:50:01 -05:00
Jesper Juhl
715252d419 IB/srpt: Don't return freed pointer from srpt_alloc_ioctx_ring()
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-02-06 08:57:11 -08:00
Dan Carpenter
3af336376f IB/srpt: Fix ERR_PTR() vs. NULL checking confusion
transport_init_session() and target_fabric_configfs_init() don't
return NULL pointers, they only return ERR_PTRs or valid pointers.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-02-03 09:07:00 -08:00
Jesper Juhl
ebfded8c4b IB/srpt: Remove unneeded <linux/version.h> include
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-02-02 12:55:59 -08:00
Roland Dreier
f225066b64 IB/srpt: Use ARRAY_SIZE() instead of open-coding
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-02-02 12:55:58 -08:00
Roland Dreier
486d8b9f88 IB/srpt: Use DEFINE_SPINLOCK()/LIST_HEAD()
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-02-02 12:55:58 -08:00
Roland Dreier
f36ae34238 Merge branches 'cma', 'ipath', 'misc', 'mlx4', 'nes' and 'qib' into for-next 2012-01-30 16:18:21 -08:00
Tatyana Nikolova
c5488c571f RDMA/nes: Copyright update
Update copyright information in the source files.

Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com>
Signed-off-by: Faisal Latif <Faisal.Latif@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-01-30 16:18:07 -08:00
Jack Morgenstein
a6f7feae6d IB/mlx4: pass SMP vendor-specific attribute MADs to firmware
In the current code, vendor-specific MADs (e.g with the FDR-10
attribute) are silently dropped by the driver, resulting in timeouts
at the sending side and inability to query/configure the relevant
feature.  However, the ConnectX firmware is able to handle such MADs.
For unsupported attributes, the firmware returns a GET_RESPONSE MAD
containing an error status.

For example, for a FDR-10 node with LID 11:

    # ibstat mlx4_0 1

    CA: 'mlx4_0'
    Port 1:
    State: Active
    Physical state: LinkUp
    Rate: 40 (FDR10)
    Base lid: 11
    LMC: 0
    SM lid: 24
    Capability mask: 0x02514868
    Port GUID: 0x0002c903002e65d1
    Link layer: InfiniBand

Extended Port Query (EPI) vendor mad timeouts before the patch:

    # smpquery MEPI 11 -d

    ibwarn: [4196] smp_query_via: attr 0xff90 mod 0x0 route Lid 11
    ibwarn: [4196] _do_madrpc: retry 1 (timeout 1000 ms)
    ibwarn: [4196] _do_madrpc: retry 2 (timeout 1000 ms)
    ibwarn: [4196] _do_madrpc: timeout after 3 retries, 3000 ms
    ibwarn: [4196] mad_rpc: _do_madrpc failed; dport (Lid 11)
    smpquery: iberror: [pid 4196] main: failed: operation EPI: ext port info query failed

EPI query works OK with the patch:

    # smpquery MEPI 11 -d

    ibwarn: [6548] smp_query_via: attr 0xff90 mod 0x0 route Lid 11
    ibwarn: [6548] mad_rpc: data offs 64 sz 64
    mad data
    0000 0000 0000 0001 0000 0001 0000 0001
    0000 0000 0000 0000 0000 0000 0000 0000
    0000 0000 0000 0000 0000 0000 0000 0000
    0000 0000 0000 0000 0000 0000 0000 0000
    # Ext Port info: Lid 11 port 0
    StateChangeEnable:...............0x00
    LinkSpeedSupported:..............0x01
    LinkSpeedEnabled:................0x01
    LinkSpeedActive:.................0x01

Signed-off-by: Jack Morgenstein <jackm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Acked-by: Ira Weiny <weiny2@llnl.gov>
Cc: <stable@vger.kernel.org>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-01-30 16:15:17 -08:00
Tatyana Nikolova
4a4b03f4ef RDMA/nes: Fix fast memory registration opcode
Fix fast memory registration opcode in local invalidate completion.

Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com>
Signed-off-by: Donald Wood <Donald.E.Wood@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-01-27 10:15:13 -08:00
Tatyana Nikolova
94f622bdac RDMA/nes: Fix fast memory registration length
Zero high order word of fast memory registration (FMR) length field.
FMR length field is 32 bits, so high word should always be zero.

Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com>
Signed-off-by: Donald Wood <Donald.E.Wood@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-01-27 10:14:51 -08:00
Sean Hefty
9ced69ca52 RDMA/ucma: Discard all events for new connections until accepted
After reporting a new connection request to user space, the rdma_ucm
will discard subsequent events until the user has associated a user
space idenfier with the kernel cm_id.  This is needed to avoid
reporting a reject/disconnect event to the user for a request that
they may not have processed.

The user space identifier is set once the user tries to accept the
connection request.  However, the following race exists in ucma_accept():

	ctx->uid = cmd.uid;
	<events may be reported now>
	ret = rdma_accept(ctx->cm_id, ...);

Once ctx->uid has been set, new events may be reported to the user.
While the above mentioned race is avoided, there is an issue that the
user _may_ receive a reject/disconnect event if rdma_accept() fails,
depending on when the event is processed.  To simplify the use of
rdma_accept(), discard all events unless rdma_accept() succeeds.

This problem was discovered based on questions from Roland Dreier
<roland@purestorage.com>.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-01-27 10:07:48 -08:00
Mike Marciniszyn
b6bfefb041 IB/qib: Roll back PCIe tuning change
Commit 8d4548f2b ("IB/qib: Default some module parameters optimally")
introduced an issue with older root complexes.  They cannot handle the
pcie_caps of 0x51 (MaxReadReq 4096, MaxPayload=256).

A typical diagnostic in this situation reported by syslog contains
the text:

  [PCIe Poisoned TLP][Send DMA memory read]

Restore the module paramter default to zero with will avoid any
changes in the root complex.

Reviewed-by: Mark Debbage <mark.debbage@qlogic.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-01-27 10:03:38 -08:00
Julia Lawall
0f3696eb21 IB/qib: Use GFP_ATOMIC when locks are held
alloc_dummy_hdrq() is called with locks held and thus should not use
GFP_KERNEL.

The semantic patch that makes this report is available in
scripts/coccinelle/locks/call_kern.cocci.

Signed-off-by: Julia Lawall <julia.lawall@lip6.fr>
Acked-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-01-27 09:59:22 -08:00
Roland Dreier
7525c85be0 RDMA/nes: Add missing rcu_read_unlock() in nes_addr_resolve_neigh()
Make sure all exit paths from this function unlock everything.

Reported-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-01-27 09:54:30 -08:00
Tatyana Nikolova
81f99dcc93 RDMA/nes: Fix for sending MPA reject frame
Set a reject flag, when sending MPA reject message to inform the peer
that the application has rejected the connection.

Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com>
Signed-off-by: Faisal Latif <Faisal.Latif@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-01-27 09:50:48 -08:00
Dan Carpenter
ef5352875a IB/ipath: Calling PTR_ERR() on right variable in create_file()
"dentry" is a valid pointer.  "*dentry" was intended.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-01-27 09:48:27 -08:00
Bernd Schubert
e47e321a35 RDMA/core: Fix kernel panic by always initializing qp->usecnt
We have just been investigating kernel panics related to
cq->ibcq.event_handler() completion calls.  The problem is that
ib_destroy_qp() fails with -EBUSY.

Further investigation revealed qp->usecnt is not initialized.  This
counter was introduced in linux-3.2 by commit 0e0ec7e063
("RDMA/core: Export ib_open_qp() to share XRC TGT QPs") but it only
gets initialized for IB_QPT_XRC_TGT, but it is checked in
ib_destroy_qp() for any QP type.

Fix this by initializing qp->usecnt for every QP we create.

Signed-off-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de>
Signed-off-by: Sven Breuner <sven.breuner@itwm.fraunhofer.de>

[ Initialize qp->usecnt in uverbs too.  - Sean ]

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-01-27 09:20:10 -08:00
David Miller
e55684fadb infiniband: nes: Convert nes_addr_resolve_neigh() over to dst_neigh_lookup().
Now we must provide the IP destination address, and a reference has
to be dropped when we're done with the entry.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-01-25 21:30:37 -05:00
David Miller
64b7007eb9 infiniband: cxgb4: Convert import_ep() over to dst_neigh_lookup().
Now we must provide the IP destination address, and a reference has
to be dropped when we're done with the entry.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-01-25 21:30:37 -05:00
David Miller
02b619555a infiniband: Convert dst_fetch_ha() over to dst_neigh_lookup().
Now we must provide the IP destination address, and a reference has
to be dropped when we're done with the entry.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-01-25 21:30:36 -05:00
Linus Torvalds
f59e842fc0 Merge branch 'for-next-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
* 'for-next-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
  ib_srpt: Initial SRP Target merge for v3.3-rc1
2012-01-18 16:29:42 -08:00
Rusty Russell
90ab5ee941 module_param: make bool parameters really bool (drivers & misc)
module_param(bool) used to counter-intuitively take an int.  In
fddd5201 (mid-2009) we allowed bool or int/unsigned int using a messy
trick.

It's time to remove the int/unsigned int option.  For this version
it'll simply give a warning, but it'll break next kernel version.

Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-01-13 09:32:20 +10:30
Rusty Russell
69116f279a module_param: avoid bool abuse, add bint for special cases.
For historical reasons, we allow module_param(bool) to take an int (or
an unsigned int).  That's going away.

A few drivers really want an int: they set it to -1 and a parameter
will set it to 0 or 1.  This sucks: reading them from sysfs will give
'Y' for both -1 and 1, but if we change it to an int, then the users
might be broken (if they did "param" instead of "param=1").

Use a new 'bint' parser for them.

(ntfs has a different problem: it needs an int for debug_msgs because
it's also exposed via sysctl.)

Cc: Steve Glendinning <steve.glendinning@smsc.com>
Cc: Jean Delvare <khali@linux-fr.org>
Cc: Guenter Roeck <guenter.roeck@ericsson.com>
Cc: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Cc: Christoph Raisch <raisch@de.ibm.com>
Cc: Roland Dreier <roland@kernel.org>
Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
Cc: linux390@de.ibm.com
Cc: Anton Altaparmakov <anton@tuxera.com>
Cc: Jaroslav Kysela <perex@perex.cz>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: lm-sensors@lm-sensors.org
Cc: linux-rdma@vger.kernel.org
Cc: linux-s390@vger.kernel.org
Cc: linux-ntfs-dev@lists.sourceforge.net
Cc: alsa-devel@alsa-project.org
Acked-by: Takashi Iwai <tiwai@suse.de> (For the sound part)
Acked-by: Guenter Roeck <guenter.roeck@ericsson.com> (For the hwmon driver)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-01-13 09:32:17 +10:30