Add per-device dma_mapping_ops support for CONFIG_X86_64 as POWER
architecture does:
This enables us to cleanly fix the Calgary IOMMU issue that some devices
are not behind the IOMMU (http://lkml.org/lkml/2008/5/8/423).
I think that per-device dma_mapping_ops support would be also helpful for
KVM people to support PCI passthrough but Andi thinks that this makes it
difficult to support the PCI passthrough (see the above thread). So I
CC'ed this to KVM camp. Comments are appreciated.
A pointer to dma_mapping_ops to struct dev_archdata is added. If the
pointer is non NULL, DMA operations in asm/dma-mapping.h use it. If it's
NULL, the system-wide dma_ops pointer is used as before.
If it's useful for KVM people, I plan to implement a mechanism to register
a hook called when a new pci (or dma capable) device is created (it works
with hot plugging). It enables IOMMUs to set up an appropriate
dma_mapping_ops per device.
The major obstacle is that dma_mapping_error doesn't take a pointer to the
device unlike other DMA operations. So x86 can't have dma_mapping_ops per
device. Note all the POWER IOMMUs use the same dma_mapping_error function
so this is not a problem for POWER but x86 IOMMUs use different
dma_mapping_error functions.
The first patch adds the device argument to dma_mapping_error. The patch
is trivial but large since it touches lots of drivers and dma-mapping.h in
all the architecture.
This patch:
dma_mapping_error() doesn't take a pointer to the device unlike other DMA
operations. So we can't have dma_mapping_ops per device.
Note that POWER already has dma_mapping_ops per device but all the POWER
IOMMUs use the same dma_mapping_error function. x86 IOMMUs use device
argument.
[akpm@linux-foundation.org: fix sge]
[akpm@linux-foundation.org: fix svc_rdma]
[akpm@linux-foundation.org: build fix]
[akpm@linux-foundation.org: fix bnx2x]
[akpm@linux-foundation.org: fix s2io]
[akpm@linux-foundation.org: fix pasemi_mac]
[akpm@linux-foundation.org: fix sdhci]
[akpm@linux-foundation.org: build fix]
[akpm@linux-foundation.org: fix sparc]
[akpm@linux-foundation.org: fix ibmvscsi]
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
iph->tot_len is stored in network byte order, so access it using
ntohs(). This doesn't have any real world impact on pasemi_mac, since
the device only exists as part of a big-endian system-on-chip, but
fixing this gets rid of a sparse warning and avoids having a bad example
in the tree.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (202 commits)
[POWERPC] Fix compile breakage for 64-bit UP configs
[POWERPC] Define copy_siginfo_from_user32
[POWERPC] Add compat handler for PTRACE_GETSIGINFO
[POWERPC] i2c: Fix build breakage introduced by OF helpers
[POWERPC] Optimize fls64() on 64-bit processors
[POWERPC] irqtrace support for 64-bit powerpc
[POWERPC] Stacktrace support for lockdep
[POWERPC] Move stackframe definitions to common header
[POWERPC] Fix device-tree locking vs. interrupts
[POWERPC] Make pci_bus_to_host()'s struct pci_bus * argument const
[POWERPC] Remove unused __max_memory variable
[POWERPC] Simplify xics direct/lpar irq_host setup
[POWERPC] Use pseries_setup_i8259_cascade() in pseries_mpic_init_IRQ()
[POWERPC] Turn xics_setup_8259_cascade() into a generic pseries_setup_i8259_cascade()
[POWERPC] Move xics_setup_8259_cascade() into platforms/pseries/setup.c
[POWERPC] Use asm-generic/bitops/find.h in bitops.h
[POWERPC] 83xx: mpc8315 - fix USB UTMI Host setup
[POWERPC] 85xx: Fix the size of qe muram for MPC8568E
[POWERPC] 86xx: mpc86xx_hpcn - Temporarily accept old dts node identifier.
[POWERPC] 86xx: mark functions static, other minor cleanups
...
Having the id field be an int was making more complex bus topologies
excessively difficult. For now, just convert it to a string, and
change all instances of "bus->id = val" to
snprintf(id, MII_BUS_ID_LEN, "%x", val).
Signed-off-by: Andy Fleming <afleming@freescale.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Add netpoll support to allow use of netconsole.
Signed-off-by: Nate Case <ncase@xes-inc.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
Acked-by: Jeff Garzik <jgarzik@pobox.com>
Fix a couple of corner cases around interface up/down when jumbo frames are
configured. Resources weren't always freed and reallocated properly.
Signed-off-by: Olof Johansson <olof@lixom.net>
Acked-by: Jeff Garzik <jgarzik@pobox.com>
Ethtool support will handle the runtime toggling, but we do quite a bit
better with it on by default so just leave it on for now.
Signed-off-by: Olof Johansson <olof@lixom.net>
Acked-by: Jeff Garzik <jgarzik@pobox.com>
First cut at jumbo frame support. To support large MTU, one or several
separate channels must be allocated to calculate the TCP/UDP checksum
separately, since the mac lacks enough buffers to hold a whole packet
while it's being calculated.
Furthermore, it seems that a single function channel is not quite
enough to feed one of the 10Gig links, so allocate two channels for
XAUI interfaces.
Signed-off-by: Olof Johansson <olof@lixom.net>
Acked-by: Jeff Garzik <jgarzik@pobox.com>
Also stop both rx and tx sections before changing the configuration of
the dma device during init.
Signed-off-by: Olof Johansson <olof@lixom.net>
Acked-by: Jeff Garzik <jgarzik@pobox.com>
Turns out we never disable the interface. It doesn't really cause
any problems since the channel is off, but it's still better to do it
this way.
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently keeping it at 1500 bytes or below since jumbo frames need
special checksum offload on TX.
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Straightforward. It used to be hardcoded and impossible to override
with ifconfig.
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
pasemi_mac: Don't enable RX/TX without a link (if possible)
Don't enable RX/TX of packets until we have a link, since there's a chance
we'll just get RX frame errors, etc.
The case where we don't have a PHY we can't do much about: Just enable
it and deal with errors as they come in.
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
pasemi_mac: Print warning when not attaching to a PHY
Print a warning on the console when not connecting to a phy for an interface.
It turns out to be a pretty common problem when someone gets the MDIO info
wrong in their device tree, resulting in the macs running at a fixed 1Gbit FD.
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
pasemi_mac: Remove SKB copy/recycle logic
It doesn't really buy us much, since copying is about as expensive
as the allocation in the first place. Just remove it for now.
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
pasemi_mac: SKB unmap optimization
Avoid touching skb_shinfo() in the unmap path, since it turns out to
normally cause cache misses and delays. instead, save number of fragments
in the TX_RING_INFO structures since that's all that's needed anyway.
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
pasemi_mac: Software-based LRO support
Implement LRO for pasemi_mac. Pretty straightforward.
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
pasemi_mac: Improve RX interrupt mitigation
Currently the receive side interrupts will go off on the reception of
a packet, NAPI will poll the ring and keep polling as long as there's
a decent amount of packets to receive.
This is less than optimal, especially for LRO where it's better if we
have a more substantial amount of packets to process at once, to get
the real LRO benefits.
So, set the count threshold to a higher value and use the timeout feature
that will give us an interrupt even if not enough packets have come in
to set off the count threshold.
FIXME: It'd be real nice to have ethtool support for users to tune this
at runtime.
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
pasemi_mac: Fix TX cleaning
This is a bit awkward. We don't have a timer-delayed interrupt on TX
complete, but we have a count threshold. So set that reasonably high
(32 packets), and schedule the NAPI poll when it goes off. Also bump a
regular timer that will take care of rotting packets for the last 1..31
ones in case we don't trigger a TX interrupt (and there's no RX activity
that would otherwise trigger the poll).
The longer-term fix is to separate TX from RX NAPI and do two separate
poll loops.
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
pasemi_mac: performance tweaks
* Seems like we do better with a smaller RX ring, probably because chances of
still having the SKB cached are better
* Const-ify variables to get better code generation and fewer reloads
* Move prefetching around a little, and try to prefetch the whole SKB
* Set NETIF_F_HIGHDMA
* Misc other minor tweaks
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
pasemi_mac: Convert to new dma library
Convert the pasemi_mac driver to the new platform global DMA manaagement
library. This also does a couple of other minor cleanups w.r.t. channel
management.
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
pasemi_mac: Move register definitions to include/asm-powerpc
Move the common register formats and descriptor layouts from
drivers/net/pasemi_mac.h to include/asm-poewrpc/pasemi_dma.h
Previously only the ethernet driver was using them, but other drivers
are coming up that will also use them, so it makes sense to share the
constants.
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
pasemi_mac: RX/TX ring management cleanup
Prepare a bit for supporting multiple TX queues by cleaning up some
of the ring management and shuffle things around a bit.
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Turns out we're freeing the skb when we detect CRC error, but we're
not clearing out info->skb. We could either clear it and have the stack
reallocate it, or just leave it and the rx ring refill code will reuse
the one that was allocated.
Reusing a freed skb obviously caused some nasty crashes of various kind,
as reported by Brent Baude and David Woodhouse.
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Make sure we don't feed packets with bad CRC up the network stack,
and discount the packet length as reported from the MAC for the CRC
field.
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Don't use the "replace source address with local MAC address" bits, since
it causes problems on some variations of the hardware due to an erratum.
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Add missing &:
drivers/net/pasemi_mac.c: In function 'pasemi_mac_clean_rx':
drivers/net/pasemi_mac.c:553: warning: passing argument 1 of 'prefetch'
makes pointer from integer without a cast
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
pasemi_mac: enable iommu support
Enable IOMMU support for pasemi_mac, but avoid using it on non-partitioned
systems for performance reasons.
The user can override this by selecting the PPC_PASEMI_IOMMU_DMA_FORCE
configuration option.
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
pasemi_mac: use buffer index pointer in clean_rx()
Use the new features in B0 for buffer ring index on the receive side. This
means we no longer have to search in the ring for where the buffer
came from.
Also cleanup the RX cleaning side a little, while I was at it.
Note: Pre-B0 hardware is no longer supported, and needs a pile of other
workarounds that are not being submitted for mainline inclusion. So the
fact that this breaks old hardware is not a problem at this time.
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
pasemi_mac: clear out old errors on interface open
Clear out any pending errors when an interface is brought up. Since the bits
are sticky, they might be from interface shutdown time after firmware has
used it, etc.
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
pasemi_mac: update todo list
Remove some stale todo items that have been taken care of. Add a couple
of upcoming ones.
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
pasemi_mac: further performance tweaks
Misc driver tweaks for pasemi_mac:
* Increase ring size (really needed mostly on 10G)
* Take out an unneeded barrier
* Move around a few prefetches and reorder a few calls
* Don't try to clean on full tx buffer, just let things
take their course and stop the queue directly
* Avoid filling on the same line as the interface is
working on to reduce cache line bouncing
* Avoid unneeded clearing of software state (and make the
interface shutdown code handle it)
* Fix up some of the tx ring wrap logic.
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
pasemi_mac: add local skb alignment
Add local SKB alignment to pasemi_mac, since ppc64 in general has it at 0
because of design flaws in some of the IBM server bridge chips. However,
for PWRficient doing the unaligned copies is more expensive than doing
unaligned DMA so make sure the data is aligned instead.
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
pasemi_mac: workaround for erratum 5971
Implement workarounds for erratum 5971, where L2 hints aren't considered
properly unless the way hint is enabled on the interface. Since L2 isn't
setup to dedicate a way to headers, we need to reset the packet count
by hand so it won't run out of credits.
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
pasemi_mac: implement sg support
Implement SG support for pasemi_mac
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
pasemi_mac: rework ring management
Rework ring management, switching to an opaque ring format instead of
the struct-based descriptor+pointer setup, since it will be needed for
SG support.
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
pasemi_mac: fix bug in receive buffer dma mapping
skb->len isn't actually set to the size of the allocated skb, so don't
try to use it when figuring out how much to map.
(This hasn't surfaced as a real bug because we effectively disable
translation for the interface, but it still needs fixing for the future)
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
pasemi_mac: basic error checking
Add some rudimentary error checking to pasemi_mac.
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
pasemi_mac: pass in count of buffers to replenish rx ring with
Refactor replenish_rx_ring to take an argument for how many entries to
fill. Since it's normally available from where it's called anyway, this
is just simpler. It also removes the awkward logic to try to figure out
if we're filling for the first time or not.
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
pasemi_mac: don't enable rx before there are buffers on the ring
Reorder initialization of the DMA channels and the interface. Before there
was a time window when the interface was enabled before DMA was enabled.
Also, now there will always be RX buffers available at the time the
MAC interface is enabled, to avoid temporary out-of-buffer errors for the
very first packets (on busy networks).
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
pasemi_mac: flags as passed to spin_*_irqsave() should be unsigned long.
Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
pasemi_mac: set interface speed correctly on XAUI ports
Set interface speed for XAUI to 10G per default, not 1G.
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
drivers/net/pasemi_mac.c: At top level:
drivers/net/pasemi_mac.c:89: warning: 'read_iob_reg' defined but not used
Signed-off-by: Satyam Sharma <satyam@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
We now have struct net_device_stats embedded in struct net_device,
and the default ->get_stats() hook does the obvious thing for us.
Run through drivers/net/* and remove the driver-local storage of
statistics, and driver-local ->get_stats() hook where applicable.
This was just the low-hanging fruit in drivers/net; plenty more drivers
remain to be updated.
[ Resolved conflicts with napi_struct changes and fix sunqe build
regression... -DaveM ]
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
It's been a useless no-op for long enough in 2.6 so I figured it's time to
remove it. The number of people that could object because they're
maintaining unified 2.4 and 2.6 drivers is probably rather small.
[ Handled drivers added by netdev tree and some missed IRDA cases... -DaveM ]
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Unfortunately there's no timeout for how long a packet can sit on
the TX ring after completion before an interrupt is generated, and
we want to have a threshold that's larger than one packet per interrupt.
So we have to have a timer that occasionally cleans the TX ring even
though there hasn't been an interrupt. Instead of setting up a dedicated
timer for this, just clean it in the NAPI poll routine instead.
[ Resolved conflicts with napi_struct changes... -DaveM ]
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Enable LLTX on pasemi_mac: we're already doing sufficient locking
in the driver to enable it.
[ Resolved merge conflicts with napi_struct changes... -DaveM ]
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: David S. Miller <davem@davemloft.net>