Commit Graph

45 Commits (a1531acd43310a7e4571d52e8846640667f4c74b)

Author SHA1 Message Date
FUJITA Tomonori 8d8bb39b9e dma-mapping: add the device argument to dma_mapping_error()
Add per-device dma_mapping_ops support for CONFIG_X86_64 as POWER
architecture does:

This enables us to cleanly fix the Calgary IOMMU issue that some devices
are not behind the IOMMU (http://lkml.org/lkml/2008/5/8/423).

I think that per-device dma_mapping_ops support would be also helpful for
KVM people to support PCI passthrough but Andi thinks that this makes it
difficult to support the PCI passthrough (see the above thread).  So I
CC'ed this to KVM camp.  Comments are appreciated.

A pointer to dma_mapping_ops to struct dev_archdata is added.  If the
pointer is non NULL, DMA operations in asm/dma-mapping.h use it.  If it's
NULL, the system-wide dma_ops pointer is used as before.

If it's useful for KVM people, I plan to implement a mechanism to register
a hook called when a new pci (or dma capable) device is created (it works
with hot plugging).  It enables IOMMUs to set up an appropriate
dma_mapping_ops per device.

The major obstacle is that dma_mapping_error doesn't take a pointer to the
device unlike other DMA operations.  So x86 can't have dma_mapping_ops per
device.  Note all the POWER IOMMUs use the same dma_mapping_error function
so this is not a problem for POWER but x86 IOMMUs use different
dma_mapping_error functions.

The first patch adds the device argument to dma_mapping_error.  The patch
is trivial but large since it touches lots of drivers and dma-mapping.h in
all the architecture.

This patch:

dma_mapping_error() doesn't take a pointer to the device unlike other DMA
operations.  So we can't have dma_mapping_ops per device.

Note that POWER already has dma_mapping_ops per device but all the POWER
IOMMUs use the same dma_mapping_error function.  x86 IOMMUs use device
argument.

[akpm@linux-foundation.org: fix sge]
[akpm@linux-foundation.org: fix svc_rdma]
[akpm@linux-foundation.org: build fix]
[akpm@linux-foundation.org: fix bnx2x]
[akpm@linux-foundation.org: fix s2io]
[akpm@linux-foundation.org: fix pasemi_mac]
[akpm@linux-foundation.org: fix sdhci]
[akpm@linux-foundation.org: build fix]
[akpm@linux-foundation.org: fix sparc]
[akpm@linux-foundation.org: fix ibmvscsi]
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:03 -07:00
Divy Le Ray b47385bd4f cxgb3 - Add LRO support
Add LRO support.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-05-22 06:34:13 -04:00
Divy Le Ray 7385ecf339 cxgb3 - Add page support to jumbo frame Rx queue
Add page support to Jumbo frame Rx queues.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-05-22 06:34:11 -04:00
Divy Le Ray b1fb1f280d cxgb3 - Fix dma mapping error path
Take potential dma mapping errors in account.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-05-22 06:34:10 -04:00
Divy Le Ray 204e2f98c2 cxgb3 - fix EEH
Reset the chip when the PCI link goes down.
Preserve the napi structure when a sge qset's resources are freed.
Replay only HW initialization when the chip comes out of reset.

Signed-off-by: Divy Le ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-05-13 01:31:37 -04:00
Roland Dreier b1186dee3e cxgb3: Fix lockdep problems with sge.reg_lock
Using iWARP with a Chelsio T3 NIC generates the following lockdep warning:

    =================================
    [ INFO: inconsistent lock state ]
    2.6.25-rc6 #50
    ---------------------------------
    inconsistent {softirq-on-W} -> {in-softirq-W} usage.
    swapper/0 [HC0[0]:SC1[1]:HE0:SE0] takes:
     (&adap->sge.reg_lock){-+..}, at: [<ffffffff880e5ee2>] cxgb_offload_ctl+0x3af/0x507 [cxgb3]

The problem is that reg_lock is used with plain spin_lock() in
drivers/net/cxgb3/sge.c but is used with spin_lock_irqsave() in
drivers/net/cxgb3/cxgb3_offload.c.  This is technically a false
positive, since the uses in sge.c are only in the initialization and
cleanup paths and cannot overlap with any use in interrupt context.

The best fix is probably just to use spin_lock_irq() with reg_lock in
sge.c.  Even though it's not strictly required for correctness, it
avoids triggering lockdep and the extra overhead of disabling
interrupts is not important at all in the initialization and cleanup
slow paths.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-03-25 23:42:05 -04:00
Divy Le Ray cd7e903440 cxgb3: Fix transmit queue stop mechanism
The last change in the Tx queue stop mechanism opens a window
where the Tx queue might be stopped after pending credits
returned.

Tx credits are returned via a control message generated by the HW.
It returns tx credits on demand, triggered by a completion bit
set in selective transmit packet headers.

The current code can lead to the Tx queue stopped
with all pending credits returned, and the current frame
not triggering a credit return. The Tx queue will then never be
awaken.

The driver could alternatively request a completion for packets
that stop the queue. It's however safer at this point to go back
to the pre-existing behaviour.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-03-17 08:07:01 -04:00
Krishna Kumar a8cc21f646 Optimize cxgb3 xmit path (a bit)
1. Add common code for stopping queue.
	2. No need to call netif_stop_queue followed by netif_wake_queue (and
	   infact a netif_start_queue could have been used instead), instead
	   call stop_queue if required, and remove code under USE_GTS macro.
	3. There is no need to check for netif_queue_stopped, as the network
	   core guarantees that for us (I am sure every driver could remove
	   that check, eg e1000 - I have tested that path a few billion times
	   with about a few hundred thousand qstops but the condition never
	   hit even once).

Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-02-11 10:44:28 -05:00
Roland Dreier 7b9b09436b cxgb3: Remove incorrect __devinit annotations
When PCI error recovery was added to cxgb3, a function t3_io_slot_reset()
was added.  This function can call back into t3_prep_adapter() at any
time, so t3_prep_adapter() can no longer be marked __devinit.
This patch removes the __devinit annotation from t3_prep_adapter() and
all the functions that it calls, which fixes

    WARNING: drivers/net/cxgb3/built-in.o(.text+0x2427): Section mismatch in reference from the function t3_io_slot_reset() to the function .devinit.text:t3_prep_adapter()

Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-03 04:28:35 -08:00
Al Viro 05e5c11653 annotate cxgb3
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-28 15:10:30 -08:00
Divy Le Ray bc4b6b5269 cxgb3 - Fix EEH, missing softirq blocking
set_pci_drvdata() stores a pointer to the adapter,
not the net device.
Add missing softirq blocking in t3_mgmt_tx.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-28 15:07:22 -08:00
Divy Le Ray b881955b7d cxgb3 - parity initialization for T3C adapters.
Add parity initialization for T3C adapters.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-28 15:07:22 -08:00
Divy Le Ray afefce66a5 cxgb3 - Fix I/O synchronization
Synchronize memory access before ringing
the Tx door bell.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-28 15:04:12 -08:00
Divy Le Ray 23561c9447 cxgb3 - fix interaction with pktgen
Do not use skb->cb to stash unmap info,
save the info to the descriptor state.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-28 15:04:09 -08:00
Jeff Garzik 7c2399756a [SPARC, XEN, NET/CXGB3] use irq_handler_t where appropriate
Rather than hand-rolling our own prototype, make the code more
future-proof by using the standard irq_handler_t typedef.

Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2007-10-23 19:53:17 -04:00
Stephen Hemminger 9265fabf0d cxgb3 sparse warning fixes
Fix warnings from sparse related to shadowed variables and routines
that should be declared static.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-10 16:55:29 -07:00
Al Viro fb8e4444cc cxgb3: trivial endianness annotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-10 16:52:06 -07:00
Divy Le Ray 27186dc325 cxgb3 - use immediate data for offload Tx
Send small TX_DATA work requests as immediate data even when
there are fragments. this avoids doing multiple DMAs for
small fragmented packets.
The driver already implements this optimization for small
contiguous packets.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-10 16:50:48 -07:00
Divy Le Ray 6e3f03b72c cxgb3 - SGE doorbell overflow warning
Log doorbell Fifo overflow

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-10 16:50:47 -07:00
Stephen Hemminger bea3348eef [NET]: Make NAPI polling independent of struct net_device objects.
Several devices have multiple independant RX queues per net
device, and some have a single interrupt doorbell for several
queues.

In either case, it's easier to support layouts like that if the
structure representing the poll is independant from the net
device itself.

The signature of the ->poll() call back goes from:

	int foo_poll(struct net_device *dev, int *budget)

to

	int foo_poll(struct napi_struct *napi, int budget)

The caller is returned the number of RX packets processed (or
the number of "NAPI credits" consumed if you want to get
abstract).  The callee no longer messes around bumping
dev->quota, *budget, etc. because that is all handled in the
caller upon return.

The napi_struct is to be embedded in the device driver private data
structures.

Furthermore, it is the driver's responsibility to disable all NAPI
instances in it's ->stop() device close handler.  Since the
napi_struct is privatized into the driver's private data structures,
only the driver knows how to get at all of the napi_struct instances
it may have per-device.

With lots of help and suggestions from Rusty Russell, Roland Dreier,
Michael Chan, Jeff Garzik, and Jamal Hadi Salim.

Bug fixes from Thomas Graf, Roland Dreier, Peter Zijlstra,
Joseph Fannin, Scott Wood, Hans J. Koch, and Michael Chan.

[ Ported to current tree and all drivers converted.  Integrated
  Stephen's follow-on kerneldoc additions, and restored poll_list
  handling to the old style to fix mutual exclusion issues.  -DaveM ]

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:47:45 -07:00
Divy Le Ray 5fbf816fe7 cxgb3 - Fix dev->priv usage
cxgb3 used netdev_priv() and dev->priv for different purposes.
In 2.6.23, netdev_priv() == dev->priv, cxgb3 needs a fix.
This patch is a partial backport of Dave Miller's changes in the
net-2.6.24 git branch.

Without this fix, cxgb3 crashes on 2.6.23.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-08-31 07:29:08 -04:00
Divy Le Ray cf992af561 cxgb3 - sge page management
Streamline sge page management.
Fix dma mappings when buffers are recycled.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-07-08 22:16:39 -04:00
Divy Le Ray 890de33283 cxgb3 - fix netpoll hanlder
Fix netpoll handler to work with line interrupt, msi and msi-x.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-06-20 19:16:58 -04:00
Divy Le Ray e360b5628f cxgb3 - fix skb->dev dereference
eth_type_trans() now sets skb->dev.
References to skb->dev should happen after it is called.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-06-20 19:16:58 -04:00
Arnaldo Carvalho de Melo 27d7ff46a3 [SK_BUFF]: Introduce skb_copy_to_linear_data{_offset}
To clearly state the intent of copying to linear sk_buffs, _offset being a
overly long variant but interesting for the sake of saving some bytes.

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
2007-04-25 22:28:29 -07:00
Arnaldo Carvalho de Melo d626f62b11 [SK_BUFF]: Introduce skb_copy_from_linear_data{_offset}
To clearly state the intent of copying from linear sk_buffs, _offset being a
overly long variant but interesting for the sake of saving some bytes.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2007-04-25 22:28:23 -07:00
Arnaldo Carvalho de Melo 27a884dc3c [SK_BUFF]: Convert skb->tail to sk_buff_data_t
So that it is also an offset from skb->head, reduces its size from 8 to 4 bytes
on 64bit architectures, allowing us to combine the 4 bytes hole left by the
layer headers conversion, reducing struct sk_buff size to 256 bytes, i.e. 4
64byte cachelines, and since the sk_buff slab cache is SLAB_HWCACHE_ALIGN...
:-)

Many calculations that previously required that skb->{transport,network,
mac}_header be first converted to a pointer now can be done directly, being
meaningful as offsets or pointers.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:26:28 -07:00
Arnaldo Carvalho de Melo 9c70220b73 [SK_BUFF]: Introduce skb_transport_header(skb)
For the places where we need a pointer to the transport header, it is
still legal to touch skb->h.raw directly if just adding to,
subtracting from or setting it to another layer header.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:25:31 -07:00
Arnaldo Carvalho de Melo aa8223c7bb [SK_BUFF]: Introduce tcp_hdr(), remove skb->h.th
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:25:26 -07:00
Arnaldo Carvalho de Melo ea2ae17d64 [SK_BUFF]: Introduce skb_transport_offset()
For the quite common 'skb->h.raw - skb->data' sequence.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:25:16 -07:00
Arnaldo Carvalho de Melo badff6d01a [SK_BUFF]: Introduce skb_reset_transport_header(skb)
For the common, open coded 'skb->h.raw = skb->data' operation, so that we can
later turn skb->h.raw into a offset, reducing the size of struct sk_buff in
64bit land while possibly keeping it as a pointer on 32bit.

This one touches just the most simple cases:

skb->h.raw = skb->data;
skb->h.raw = {skb_push|[__]skb_pull}()

The next ones will handle the slightly more "complex" cases.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:25:15 -07:00
Arnaldo Carvalho de Melo eddc9ec53b [SK_BUFF]: Introduce ip_hdr(), remove skb->nh.iph
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:25:10 -07:00
Arnaldo Carvalho de Melo bbe735e424 [SK_BUFF]: Introduce skb_network_offset()
For the quite common 'skb->nh.raw - skb->data' sequence.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:24:58 -07:00
Arnaldo Carvalho de Melo c1d2bbe1cd [SK_BUFF]: Introduce skb_reset_network_header(skb)
For the common, open coded 'skb->nh.raw = skb->data' operation, so that we can
later turn skb->nh.raw into a offset, reducing the size of struct sk_buff in
64bit land while possibly keeping it as a pointer on 32bit.

This one touches just the most simple case, next will handle the slightly more
"complex" cases.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:24:46 -07:00
Arnaldo Carvalho de Melo 459a98ed88 [SK_BUFF]: Introduce skb_reset_mac_header(skb)
For the common, open coded 'skb->mac.raw = skb->data' operation, so that we can
later turn skb->mac.raw into a offset, reducing the size of struct sk_buff in
64bit land while possibly keeping it as a pointer on 32bit.

This one touches just the most simple case, next will handle the slightly more
"complex" cases.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:24:32 -07:00
Arnaldo Carvalho de Melo 4c13eb6657 [ETH]: Make eth_type_trans set skb->dev like the other *_type_trans
One less thing for drivers writers to worry about.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:24:30 -07:00
Divy Le Ray 8ac3ba68e2 cxgb3 - detect NIC only adapters
Differentiate NIC only adapters from RNICs.
Initialize offload capabilities for RNICs only.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-04-03 22:31:09 -04:00
Divy Le Ray e0994eb1d9 cxgb3 - Feed Rx free list with pages
Populate Rx free list with pages.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-02-27 04:27:12 -05:00
Divy Le Ray bae73f4447 cxgb3 - Recovery from HW starvation of response queue entries.
Improve the traffic recovery after the HW ran out of response queue entries.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-02-27 04:27:12 -05:00
Divy Le Ray 99d7cf30b9 cxgb3 - Unmap offload packets when they are freed
Offload packets may be DMAed long after their SGE Tx descriptors are done
so they must remain mapped until they are freed rather than until their
descriptors are freed.  Unmap such packets through an skb destructor.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-02-27 04:27:12 -05:00
Divy Le Ray 1d68e93d65 cxgb3 - Add dual licensing
Dual licensing, needed for OFED 1.2

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-02-05 16:58:51 -05:00
Divy Le Ray f2aa52086f cxgb3 - Remove BUG_ON from t3b_intr_napi
In some cases, SG_DATA_INTR won't clear on read and the following
interrupt may cause us to assert because NAPI is already scheduled.
Remove the assertion, NAPI can handle attempts to rearm it while
it's already scheduled.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-02-05 16:58:51 -05:00
Divy Le Ray 6195c71d65 cxgb3 - remove SW Tx credits coalescing
Remove tx credit coalescing done in SW.
The HW is caring care of it already.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-02-05 16:58:50 -05:00
Divy Le Ray 14ab989245 cxgb3 - bind qsets on multiport adapter
Inform FW about the queue set->interface mapping.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-02-05 16:58:50 -05:00
Divy Le Ray 4d22de3e6c Add support for the latest 1G/10G Chelsio adapter, T3.
This driver is required by the Chelsio T3 RDMA driver posted by
Steve Wise.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-02-05 16:58:46 -05:00