PCI drivers that implement the io_error_detected callback
should return PCI_ERS_RESULT_DISCONNECT if the state
passed in is pci_channel_io_perm_failure. This state is
not checked in many of the network drivers.
The patch fixes the omission in the e1000 driver.
Based on Mike Mason's similar patch for e1000e.
Signed-off-by: Andre Detsch <adetsch@br.ibm.com>
CC: Mike Mason <mmlnx@us.ibm.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is inspired by patch recently posted by Johannes Berg. Basically what
my patch does is to group list and a count of addresses into newly introduced
structure netdev_hw_addr_list. This brings us two benefits:
1) struct net_device becames a bit nicer.
2) in the future there will be a possibility to operate with lists independently
on netdevices (with exporting right functions).
I wanted to introduce this patch before I'll post a multicast lists conversion.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
drivers/net/bnx2.c | 4 +-
drivers/net/e1000/e1000_main.c | 4 +-
drivers/net/ixgbe/ixgbe_main.c | 6 +-
drivers/net/mv643xx_eth.c | 2 +-
drivers/net/niu.c | 4 +-
drivers/net/virtio_net.c | 10 ++--
drivers/s390/net/qeth_l2_main.c | 2 +-
include/linux/netdevice.h | 17 +++--
net/core/dev.c | 130 ++++++++++++++++++--------------------
9 files changed, 89 insertions(+), 90 deletions(-)
Signed-off-by: David S. Miller <davem@davemloft.net>
skb_dma_unmap() is quite expensive for small packets,
because we use two different cache lines from skb_shared_info.
One to access nr_frags, one to access dma_maps[0]
Instead of dma_maps being an array of MAX_SKB_FRAGS + 1 elements,
let dma_head alone in a new dma_head field, close to nr_frags,
to reduce cache lines misses.
Tested on my dev machine (bnx2 & tg3 adapters), nice speedup !
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patch to fix bad length checking in e1000. E1000 by default does two
things:
1) Spans rx descriptors for packets that don't fit into 1 skb on recieve
2) Strips the crc from a frame by subtracting 4 bytes from the length prior to
doing an skb_put
Since the e1000 driver isn't written to support receiving packets that span
multiple rx buffers, it checks the End of Packet bit of every frame, and
discards it if its not set. This places us in a situation where, if we have a
spanning packet, the first part is discarded, but the second part is not (since
it is the end of packet, and it passes the EOP bit test). If the second part of
the frame is small (4 bytes or less), we subtract 4 from it to remove its crc,
underflow the length, and wind up in skb_over_panic, when we try to skb_put a
huge number of bytes into the skb. This amounts to a remote DOS attack through
careful selection of frame size in relation to interface MTU. The fix for this
is already in the e1000e driver, as well as the e1000 sourceforge driver, but no
one ever pushed it to e1000. This is lifted straight from e1000e, and prevents
small frames from causing the underflow described above
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Tested-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch converts unicast address list to standard list_head using
previously introduced struct netdev_hw_addr. It also relaxes the
locking. Original spinlock (still used for multicast addresses) is not
needed and is no longer used for a protection of this list. All
reading and writing takes place under rtnl (with no changes).
I also removed a possibility to specify the length of the address
while adding or deleting unicast address. It's always dev->addr_len.
The convertion touched especially e1000 and ixgbe codes when the
change is not so trivial.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
drivers/net/bnx2.c | 13 +--
drivers/net/e1000/e1000_main.c | 24 +++--
drivers/net/ixgbe/ixgbe_common.c | 14 ++--
drivers/net/ixgbe/ixgbe_common.h | 4 +-
drivers/net/ixgbe/ixgbe_main.c | 6 +-
drivers/net/ixgbe/ixgbe_type.h | 4 +-
drivers/net/macvlan.c | 11 +-
drivers/net/mv643xx_eth.c | 11 +-
drivers/net/niu.c | 7 +-
drivers/net/virtio_net.c | 7 +-
drivers/s390/net/qeth_l2_main.c | 6 +-
drivers/scsi/fcoe/fcoe.c | 16 ++--
include/linux/netdevice.h | 18 ++--
net/8021q/vlan.c | 4 +-
net/8021q/vlan_dev.c | 10 +-
net/core/dev.c | 195 +++++++++++++++++++++++++++-----------
net/dsa/slave.c | 10 +-
net/packet/af_packet.c | 4 +-
18 files changed, 227 insertions(+), 137 deletions(-)
Signed-off-by: David S. Miller <davem@davemloft.net>
Second round of drivers for Gb cards (and NIU one I forgot in the 10GB round)
Now that core network takes care of trans_start updates, dont do it
in drivers themselves, if possible. Drivers can avoid one cache miss
(on dev->trans_start) in their start_xmit() handler.
Exceptions are NETIF_F_LLTX drivers
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
after the recent changes to wired drivers to use only
netif_carrier_off the driver can have outstanding tx work to
complete that will never complete once link is down. Since the
intel hardware will hold this tx work forever, the driver
notices a tx timeout condition internally and might try
to instigate printk and reset of the part with a
netif_stop_queue, which doesn't work because link is down.
Don't bother arming to tx hang detection when link is down.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
a recent fix to e1000 (commit 15b2bee2) caused KVM/QEMU/VMware based
virtualized e1000 interfaces to begin failing when resetting.
This is because the driver in a virtual environment doesn't
get to run instructions *AT ALL* when an interrupt is asserted.
The interrupt code runs immediately and this recent bug fix
allows an interrupt to be possible when the interrupt handler
will reject it (due to the new code), when being called from
any path in the driver that holds the E1000_RESETTING flag.
the driver should use the __E1000_DOWN flag instead of the
__E1000_RESETTING flag to prevent interrupt execution
while reconfiguring the hardware.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It was pointed out that the Intel wired ethernet drivers do not need to
wake the tx queue since netif_carrier_on/off will take care of the qdisc
management in order to guarantee the correct handling of the transmit
routine enable state.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As reported by Andrew Lutomirski <amluto@gmail.com>
All the intel wired ethernet drivers were calling netif_carrier_off
and netif_stop_queue (or variants) before calling register_netdevice
This is incorrect behavior as was pointed out by davem, and causes
ifconfig and friends to report a strange state before first link
after the driver was loaded.
This apparently confused *some* versions of networkmanager.
Andy tested this for e1000e and confirmed it was working for him.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reported-by: Andrew Lutomirski <amluto@gmail.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the e1000 transmit cleanup inner loop exited early, then
cleaned might not be true. This could cause tx hangs or other
badness. Use count to track the total number of descriptors
cleaned instead of basing a tx queue restart off of a temporary
working state variable.
This code now makes the flow the same for e1000/e1000e/igb/ixgbe
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Prevent e1000 from putting the adapter into D3 during shutdown except when
we're going to power off the system, since doing that may generally cause
problems with kexec to happen (such problems were observed for igb and
forcedeth). For this purpose seperate e1000_shutdown() from e1000_suspend()
and use the appropriate PCI PM callbacks in both of them.
Signed-off-by: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
e1000/e1000e compile report a possible unused variable, fix
that for now. Shortly after this a small refactor and bug
fix will follow in the same code.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace all DMA_32BIT_MASK macro with DMA_BIT_MASK(32)
Signed-off-by: Yang Hongyang<yanghy@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Replace all DMA_64BIT_MASK macro with DMA_BIT_MASK(64)
Signed-off-by: Yang Hongyang<yanghy@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
e1000 (and e1000e, igb, ixgbe, ixgb) all do a series of
operations each time a multicast address is added. The flow goes
something like
1) stack adds one multicast address
2) stack passes whole current list of unicast and multicast
addresses to driver
3) driver clears entire list in hardware
4) driver programs each multicast address using iomem in a loop
This was causing multicast packets to be lost during the
reprogramming process.
reference with test program:
http://kerneltrap.org/mailarchive/linux-netdev/2009/3/14/5160514/thread
Thanks to Dave Boutcher for his report and test program.
This driver fix prepares an array all at once in memory and
programs it in one shot to the hardware, not requiring an "erase"
cycle.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
this is in regards to
http://bugzilla.kernel.org/show_bug.cgi?id=12876
where it appears that e1000 can leave its interrupt enabled after
exiting the driver. Fix the bug by making the interrupt enable
paths more aware of the driver exiting.
Thanks to Alan Cox for the poke and initial investigation.
CC: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The tx cleanup routine was stopping after 64 packets and this was causing
issues resulting in the ring not being completely cleaned.
This change updates the driver to clean the entire ring and if it doesn't
it then will retry on the next pass.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch changes the dma mapping to better support
skb_dma_map/skb_dma_unmap and addresses and redefines the tx hang logic to
be based off of time stamp instead of if the dma field is populated
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is in reference to the issue shown in kerneloops (search e1000 unmap)
The e1000 transmit code was calling pci_unmap_page on dma handles that it
might have called pci_map_single on.
Same bug as e1000e
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Acked-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Removing the unused macro PAGE_USE_COUNT(), since there is no more reference
to it. The last reference was removed by Jesse's commit number
630b25cdf4.
Signed-off-by: Breno Leitao <leitao@linux.vnet.ibm.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On machine were no IO ports are assigned the call
to pci_enable_device() will fail, even if need_ioport
is false, we need to use pci_enable_device_mem() here.
Signed-off-by: Karsten Keil <kkeil@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Base versions handle constant folding now.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A nasty bug was found where an MTU change (or anything else that caused a
reset) could race with the interrupt code. The interrupt code was entered
by a shared interrupt during the MTU change.
This change prevents the interrupt code from running while the driver is in
the middle of its reset path.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
LLTX is deprecated, don't use it. This completes the removal of LLTX from
the Intel Network drivers.
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Following the removal of the unused struct net_device * parameter from
the NAPI functions named *netif_rx_* in commit 908a7a1, they are
exactly equivalent to the corresponding *napi_* functions and are
therefore redundant.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the napi api was changed to separate its 1:1 binding to the net_device
struct, the netif_rx_[prep|schedule|complete] api failed to remove the now
vestigual net_device structure parameter. This patch cleans up that api by
properly removing it..
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit bea3348eef
"[NET]: Make NAPI polling independent of struct net_device objects."
made NAPI polling to be independent of net_device.
So e1000_adapter->polling_netdev is no longer used.
Kill it.
Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The system log messages created on a link status change need to follow a
specific format to work with tools some customers use. This also makes
the messages consistant with other Intel driver link messages.
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch moves neigh_setup and hard_start_xmit into the network device ops
structure. For bisection, fix all the previously converted drivers as well.
Bonding driver took the biggest hit on this.
Added a prefetch of the hard_start_xmit in the fast path to try and reduce
any impact this would have.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Convert to new network device ops interface.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since dev->power.should_wakeup bit is used by the PCI core to
decide whether the device should wake up the system from sleep
states, set/unset this bit whenever WOL is enabled/disabled using
e1000_set_wol(). Accordingly, use device_can_wakeup() for checking
if wake-up is supported by the device.
Signed-off-by: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We have some reasons to kill netdev->priv:
1. netdev->priv is equal to netdev_priv().
2. netdev_priv() wraps the calculation of netdev->priv's offset, obviously
netdev_priv() is more flexible than netdev->priv.
But we cann't kill netdev->priv, because so many drivers reference to it
directly.
This patch is a safe convert for netdev->priv to netdev_priv(netdev).
Since all of the netdev->priv is only for read.
But it is too big to be sent in one mail.
I split it to 4 parts and make every part smaller than 100,000 bytes,
which is max size allowed by vger.
Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The generic packet receive code takes care of setting
netdev->last_rx when necessary, for the sake of the
bonding ARP monitor.
Drivers need not do it any more.
Some cases had to be skipped over because the drivers
were making use of the ->last_rx value themselves.
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the newly introduced pci_ioremap_bar() function in drivers/net.
pci_ioremap_bar() just takes a pci device and a bar number, with the goal
of making it really hard to get wrong, while also having a central place
to stick sanity checks.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
This converts pretty much everything to print_mac. There were
a few things that had conflicts which I have just dropped for
now, no harm done.
I've built an allyesconfig with this and looked at the files
that weren't built very carefully, but it's a huge patch.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch changes e1000 to set vlan_features so TSO and CSUM
offload can be used by VLAN devices, similar as with the other
Intel drivers.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When offloading transmit checksums only, the driver was not
correctly configuring the hardware to handle the case of a zero
checksum. For UDP the correct behavior is to leave it alone, but
for tcp the checksum must be changed from 0x0000 to 0xFFFF. The
hardware takes care of this case but only if it is told the
packet is tcp.
same patch as e1000e
Signed-off-by: Dave Graham <david.graham@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since the e1000/e1000e split, no hardware supported by e1000
supports packet split, just remove the Kconfig option and associated
code from the driver.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Andrey reports e1000 corruption, and that a patch in vmware's ESX fixed
it.
The EEPROM corruption is triggered by concurrent access of the EEPROM
read/write. Putting a lock around it solve the problem.
[akpm@linux-foundation.org: use DEFINE_SPINLOCK to avoid confusing lockdep]
Signed-off-by: Christopher Li <chrisl@vmware.com>
Reported-by: Andrey Borzenkov <arvidjaar@mail.ru>
Cc: Zach Amsden <zach@vmware.com>
Cc: Pratap Subrahmanyam <pratap@vmware.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: Bruce Allan <bruce.w.allan@intel.com>
Cc: PJ Waskiewicz <peter.p.waskiewicz.jr@intel.com>
Cc: John Ronciak <john.ronciak@intel.com>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Here's the patch. It shrinks the stack from 1152 bytes to 192 bytes (the
first version, that only did the e1000_option part, got it down to 600
bytes). About half comes from not using multiple "e1000_option"
structures, the other half comes from turning the "e1000_opt_list[]"
arrays into "static const" instead, so that gcc doesn't copy them onto the
stack.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Reveiewed-by: Auke Kok <auke-jan.h.kok@intel.com>
Tested-by: Emil Tantilov <emil.s.tantilov@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>