Transform some calls to kmalloc/memset to a single kzalloc (or kcalloc).
Here is a short excerpt of the semantic patch performing
this transformation:
@@
type T2;
expression x;
identifier f,fld;
expression E;
expression E1,E2;
expression e1,e2,e3,y;
statement S;
@@
x =
- kmalloc
+ kzalloc
(E1,E2)
... when != \(x->fld=E;\|y=f(...,x,...);\|f(...,x,...);\|x=E;\|while(...) S\|for(e1;e2;e3) S\)
- memset((T2)x,0,E1);
@@
expression E1,E2,E3;
@@
- kzalloc(E1 * E2,E3)
+ kcalloc(E1,E2,E3)
[akpm@linux-foundation.org: get kcalloc args the right way around]
Signed-off-by: Yoann Padioleau <padator@wanadoo.fr>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Acked-by: Russell King <rmk@arm.linux.org.uk>
Cc: Bryan Wu <bryan.wu@analog.com>
Acked-by: Jiri Slaby <jirislaby@gmail.com>
Cc: Dave Airlie <airlied@linux.ie>
Acked-by: Roland Dreier <rolandd@cisco.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Acked-by: Dmitry Torokhov <dtor@mail.ru>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Acked-by: Pierre Ossman <drzeus-list@drzeus.cx>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: "David S. Miller" <davem@davemloft.net>
Acked-by: Greg KH <greg@kroah.com>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Current code sets size0 to 0 at the start of work request posting
functions and then handles size0 == 0 specially within the loop over
work requests. Change this so size0 is set along with f0 the first
time through the loop (when nreq == 0). This makes the code easier to
understand by making it clearer that f0 and size0 are always
initialized if nreq != 0 without having to know that size0 == 0
implies nreq == 0.
Also annotate size0 with uninitialized_var() so that this doesn't
introduce a new compiler warning.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Factor code to set UD entries out of the work request posting
functions into inline functions set_tavor_ud_seg() and
set_arbel_ud_seg(). This doesn't change the generated code in any
significant way, and makes the source easier on the eyes.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Factor code to set remote address and atomic segment entries out of the
work request posting functions into inline functions set_raddr_seg()
and set_atomic_seg(). This doesn't change the generated code in any
significant way, and makes the source easier on the eyes.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Factor code to set remote address, atomic and datagram segments out of
mlx4_ib_post_send() into small helper functions. This doesn't change
the generated code in any significant way, and makes the source easier
on the eyes.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Factor code to set data segment entries out of mlx4_ib_post_send()
into set_data_seg(). This cleans up the code and lets the compiler do
a better job -- on x86_64:
add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-16 (-16)
function old new delta
mlx4_ib_post_send 1598 1582 -16
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Factor code to set data segment entries out of the work request
posting functions into inline functions mthca_set_data_seg() and
mthca_set_data_seg_inval(). This makes the code more readable and
also allows the compiler to do a better job -- on x86_64:
add/remove: 0/0 grow/shrink: 0/6 up/down: 0/-69 (-69)
function old new delta
mthca_arbel_post_srq_recv 373 369 -4
mthca_arbel_post_receive 570 562 -8
mthca_tavor_post_srq_recv 520 508 -12
mthca_tavor_post_send 1344 1330 -14
mthca_arbel_post_send 1481 1467 -14
mthca_tavor_post_receive 792 775 -17
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Return the receive queue sizes for both userspace QPs and kernel Qps
(not just kernel QPs) from mlx4_ib_query_qp(). Also zero the send
queue sizes for userspace QPs to avoid a possible information leak,
and set the max_inline_data for kernel QPs to 0 since inline sends are
not supported for kernel QPs.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Commit 9db48926 ("drivers/infiniband/hw/mthca/mthca_qp: kill uninit'd
var warning") added "= 0" to the declarations of f0 to shut up gcc
warnings. However, there's no point in making the code bigger by
initializing f0 to a random value just to get rid of a warning;
setting f0 to 0 is no safer than just using uninitialized_var(), which
documents the situation better and gives smaller code too. For example,
on x86_64:
add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-16 (-16)
function old new delta
mthca_tavor_post_send 1352 1344 -8
mthca_arbel_post_send 1489 1481 -8
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Make some functions that are only used in a single .c file static. In
addition to being a cleanup, this shrinks the generated code. On x86_64:
add/remove: 1/3 grow/shrink: 2/1 up/down: 4777/-4956 (-179)
function old new delta
handle_errors - 3994 +3994
__verbs_timer 42 710 +668
ipath_do_ruc_send 2131 2246 +115
ipath_no_bufs_available 136 - -136
ipath_disarm_senderrbufs 639 - -639
ipath_ib_timer 658 - -658
ipath_intr 5878 2355 -3523
Signed-off-by: Roland Dreier <rolandd@cisco.com>
When warning about out-of-date firmware, current mthca code messes up
the formatting of the version if the subminor doesn't have three
digits. It doesn't fill the field with 0s so we end up with:
ib_mthca 0000:0b:00.0: HCA FW version 1.1. 0 is old (1.2. 0 is current).
Change the format from "%3d" to "%03d" to get the right thing printed.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The mthca driver supports both MSI and MSI-X. However, MSI-X works with
all hardware that the driver handles, and provides a superset of what
MSI does, so there's no point in having code for both. Schedule MSI
support for removal in 2008 to give anyone who actually needs MSI and
who can't use MSI time to speak up.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Run the existing ehca code through checkpatch.pl and clean up the
worst of the coding style violations.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Split ehca_set_pagebuf() into three functions depending on MR type
(phys/user/fast) and remove superfluous ehca_set_pagebuf_1().
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
- Rename struct ehca_mr fields to clearly distinguish between kernel
and HW page size.
- Sort struct ehca_mr_pginfo into a common part and a union containing
specific fields for physical, user and fast MR
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Instead of one error mapping function for each potential error source
in ehca_mrmw.c, use a centralized function that handles all cases,
saving a three-figure line count.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Autodetection was missing a few HW revisions, causing certain eHCA1
revisions to be treated like eHCA2. Fixed.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
When clearing the ib_ah_attr parameter in to_ib_ah_attr(), use sizeof
*ib_ah_attr instead of sizeof *path. This is the same bug as was
fixed for mthca in 99d4f22e ("IB/mthca: Use correct structure size in
call to memset()"), but the code was cut and pasted into mlx4 before the
fix was merged.
Signed-off-by: Dotan Barak <dotanb@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
When a QP is in the INIT state, the sched_queue field hasn't been given
to the firmware yet, so the firmware cannot return the value when the QP
is queried. To handle this, use the port number that is saved in the
driver's QP data structure.
Found by Dotan Barak and Yaron Gepstein of Mellanox.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Correct the mask used to get the flow label, since the field is 20 bits,
not 24 bits.
Found by Dotan Barak and Yaron Gepstein of Mellanox.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
drivers/infiniband/hw/mthca/mthca_qp.c: In function
‘mthca_tavor_post_send’:
drivers/infiniband/hw/mthca/mthca_qp.c:1594: warning: ‘f0’ may be used
uninitialized in this function
drivers/infiniband/hw/mthca/mthca_qp.c: In function
‘mthca_arbel_post_send’:
drivers/infiniband/hw/mthca/mthca_qp.c:1949: warning: ‘f0’ may be used
uninitialized in this function
Initializing 'f0' is not strictly necessary in either case, AFAICS.
I was considering use of uninitialized_var(), but looking at the
complex flow of control in each function, I feel it is wiser and
safer to simply zero the var and be certain of ourselves.
Signed-off-by: Jeff Garzik <jeff@garzik.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (76 commits)
IB: Update MAINTAINERS with Hal's new email address
IB/mlx4: Implement query SRQ
IB/mlx4: Implement query QP
IB/cm: Send no match if a SIDR REQ does not match a listen
IB/cm: Fix handling of duplicate SIDR REQs
IB/cm: cm_msgs.h should include ib_cm.h
IB/cm: Include HCA ACK delay in local ACK timeout
IB/cm: Use spin_lock_irq() instead of spin_lock_irqsave() when possible
IB/sa: Make sure SA queries use default P_Key
IPoIB: Recycle loopback skbs instead of freeing and reallocating
IB/mthca: Replace memset(<addr>, 0, PAGE_SIZE) with clear_page(<addr>)
IPoIB/cm: Fix warning if IPV6 is not enabled
IB/core: Take sizeof the correct pointer when calling kmalloc()
IB/ehca: Improve latency by unlocking after triggering the hardware
IB/ehca: Notify consumers of LID/PKEY/SM changes after nondisruptive events
IB/ehca: Return QP pointer in poll_cq()
IB/ehca: Change idr spinlocks into rwlocks
IB/ehca: Refactor sync between completions and destroy_cq using atomic_t
IB/ehca: Lock renaming, static initializers
IB/ehca: Report RDMA atomic attributes in query_qp()
...
Instead of all drivers reading pci config space to get the revision
ID, they can now use the pci_device->revision member.
This exposes some issues where drivers where reading a word or a dword
for the revision number, and adding useless error-handling around the
read. Some drivers even just read it for no purpose of all.
In devices where the revision ID is being copied over and used in what
appears to be the equivalent of hotpath, I have left the copy code
and the cached copy as not to influence the driver's performance.
Compile tested with make all{yes,mod}config on x86_64 and i386.
Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
Acked-by: Dave Jones <davej@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Kick the hardware before unlocking the send/receive queue to overlap
processing a little more.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
When firmware reports a nondisruptive port configuration change event,
previous versions of the eHCA driver didn't forward the event to consumers
like IPoIB. Add code that determines the type of configuration change by
comparing old and new port attributes and reports it.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This eliminates lock contention among IRQs as well as the need to
disable IRQs around idr_find, because there are no IRQ writers.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
- ehca_cq.nr_events is made an atomic_t, eliminating a lot of locking.
- The CQ is removed from the CQ idr first now to make sure no more
completions are scheduled on that CQ. The "wait for all completions to
end" code becomes much simpler this way.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
- Rename all spinlock flags to "flags", matching the vast majority of kernel
code.
- Move hcall_lock into the only file it's used in.
- Replaced spin_lock_init() and friends with static initializers for
global variables.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Support SRQs on eHCA2. Since an SRQ is a QP for eHCA2, a lot of code
(structures, create, destroy, post_recv) can be shared between QP and SRQ.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
- Replace init_qp_queues() by a shorter init_qp_queue(), eliminating
duplicate code.
- hipz_h_alloc_resource_qp() doesn't need a pointer to struct ehca_qp any
longer. All input and output data is transferred through the parms
parameter.
- Change the interface to also support SRQ.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
In preparation for support of new eHCA2 features, change adapter probing:
- Hardware level is changed to encode major and minor chip version
- Hardware capabilities are queried from the firmware
- The maximum MTU is queried from the firmware instead of assuming a
fixed value
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Return the PortGUID of the correct port when responding to a NodeInfo
query. Returning the SystemImageGUID causes issues when there are
multiple HCAs in a single system.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The changeset 3859e39d ("IB/ipath: Support larger IB_QP_MAX_DEST_RD_ATOMIC
and IB_QP_MAX_QP_RD_ATOMIC") added support for larger RD_ATOMIC values,
but it failed to take out the stricter checks that were before these and
hence had no effect. This patch takes out the bogus checks....
Signed-off-by: Arthur Jones <arthur.jones@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
All too often, interrupts do not get enabled for our card due to BIOS
misconfiguration and other issues. This patch checks for that
condition on startup and warns the user. This patch is based on work
(check LID availability) by Robert Walsh.
Signed-off-by: Arthur Jones <arthur.jones@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The default calculation for the number of send buffers to allocate to
the kernel was too high for the PCIe version of the chip thus leaving
fewer than desired send buffers for user MPI applications.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>